Java
Performance Evaluation
OOPSLA - October 23 2007 - Montréal
Whetting the appetite with db
[Figure: execution time (s) of db, roughly 9.0 to 12.5, for CopyMS, GenCopy, GenMS, MarkSweep and SemiSpace; two panels: best run out of 30, and 95% confidence interval for 30 runs]
Best run out of 30: we conclude there is [...] faster than GenCopy
95% confidence interval for 30 runs: GenCopy and SemiSpace do not [...]
Non-determinism is the problem
[Figure: per-benchmark values on an axis from 0.95 to 1.05 for javac, mpegaudio, luindex, compress, jess, db, mtrt, jack, antlr, bloat, fop, hsqldb, jython and pmd]
Note: move and scale the figure, and explain what causes the non-determinism.
Contributions
Pitfall associated with currently prevalent
data analysis techniques
We advocate a statistically rigorous Java
performance evaluation
Define approaches for both start-up and
steady-state and provide a tool to
automate this:
http://www.elis.ugent.be/JavaStats
Current situation
Experimental design:
Fixed heap size?
With JIT (re)compilation?
Number of VM
Number of heap sizes
hardware platforms
Application input size
Data analysis:
Mean of n runs
Median of n runs
Second best of n runs
Confidence interval of n runs
$$\bar{x} \pm t_{1-\alpha/2;\,p-1}\,\frac{s}{\sqrt{p}}$$
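The confidence interval above can be sketched in a few lines of Python. This is a minimal illustration, not the JavaStats tool: the function name, the small table of t critical values (two-sided 95%, so α = 0.05) and the sample timings are all assumptions made for the example.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t, keyed by degrees of freedom.
# Only a few common entries are tabulated; the table is an assumption of this
# sketch, not something taken from the slides.
T_975 = {2: 4.303, 4: 2.776, 9: 2.262, 19: 2.093, 29: 2.045}


def confidence_interval_95(samples):
    """Return (mean, half_width) of the 95% confidence interval for the mean."""
    p = len(samples)
    if p - 1 not in T_975:
        raise ValueError(f"no tabulated t value for {p - 1} degrees of freedom")
    mean = statistics.mean(samples)
    s = statistics.stdev(samples)  # sample standard deviation (p - 1 divisor)
    half_width = T_975[p - 1] * s / math.sqrt(p)
    return mean, half_width


# Hypothetical execution times (s) from 10 measurements.
times = [11.2, 11.5, 11.1, 11.4, 11.3, 11.6, 11.2, 11.4, 11.3, 11.5]
mean, hw = confidence_interval_95(times)
print(f"{mean:.2f} +/- {hw:.2f} s")
```

For sample counts outside the table, a statistics library that provides the t quantile function would replace the lookup.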
Dealing with steady-state
CoV (coefficient of variation) < δ
$$\bar{x}_i = \frac{1}{k}\sum_{j=s_i-k}^{s_i} x_{ij}$$
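One way to read this slide is that, within a VM invocation, steady state is reached at the first iteration whose preceding window of k iterations has a CoV below δ, and the mean of that window becomes the invocation's value. The sketch below follows that reading. The function name and the sample data are hypothetical, and the window holds exactly k measurements, which is an assumption about the intended summation bounds.

```python
import statistics


def steady_state_mean(iteration_times, k, delta):
    """Return (s, mean) for the first window of k iterations with CoV < delta.

    s is the 1-based index of the last iteration in the window.
    Returns None if no window qualifies. Requires k >= 2.
    """
    for end in range(k, len(iteration_times) + 1):
        window = iteration_times[end - k:end]
        mean = statistics.mean(window)
        cov = statistics.stdev(window) / mean  # coefficient of variation
        if cov < delta:
            return end, mean
    return None


# Hypothetical iteration times (s) within one VM invocation.
times = [5.0, 3.0, 2.2, 2.0, 2.01, 1.99, 2.0]
print(steady_state_mean(times, k=4, delta=0.02))
```

Returning None rather than a fallback value makes it explicit when an invocation never reaches steady state under the chosen k and δ.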
Dealing with steady-state
We have p mean values, one per invocation: x̄1, x̄2, x̄3, x̄4, ..., x̄p−1, x̄p
We compute the confidence interval for their mean:
$$\frac{1}{p}\sum_{i=1}^{p}\bar{x}_i \pm t_{1-\alpha/2;\,p-1}\,\frac{s}{\sqrt{p}}$$
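The slide does not spell out s. Assuming the usual definition, it is the sample standard deviation of the p per-invocation means, with x̄ denoting their overall mean:

$$\bar{x} = \frac{1}{p}\sum_{i=1}^{p}\bar{x}_i, \qquad s = \sqrt{\frac{\sum_{i=1}^{p}\left(\bar{x}_i - \bar{x}\right)^2}{p-1}}$$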
Comparison categories
Statistical approach (rigorous): ANOVA + confidence intervals
Prevalent methodology: performance difference compared with a threshold θ

Rigorous result                              Prevalent: difference < θ    Prevalent: difference ≥ θ
non-overlapping intervals, same order        misleading but correct       correct
non-overlapping intervals, different order   misleading and incorrect     incorrect
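The categories can be turned into a small decision function for a single pairwise comparison. This is a sketch under stated assumptions: it uses confidence-interval overlap only (no ANOVA), it treats lower values as better, it normalizes the prevalent difference by the larger of the two values, and it assigns "indicative" and "misleading" to the overlapping-interval case based on the legends of the result figures. All names are hypothetical.

```python
def intervals_overlap(a, b):
    """a and b are (mean, half_width) confidence intervals."""
    return abs(a[0] - b[0]) <= a[1] + b[1]


def classify(prevalent_a, prevalent_b, ci_a, ci_b, theta):
    """Categorize one A-versus-B comparison.

    prevalent_a, prevalent_b: values reported by the prevalent method
        (for example best of n runs); lower is better.
    ci_a, ci_b: (mean, half_width) pairs from the rigorous method.
    theta: relative decision threshold, e.g. 0.01 for 1%.
    """
    diff = abs(prevalent_a - prevalent_b) / max(prevalent_a, prevalent_b)
    prevalent_sees_difference = diff >= theta
    if intervals_overlap(ci_a, ci_b):
        # Assumption: the rigorous method sees no difference in this case.
        return "misleading" if prevalent_sees_difference else "indicative"
    same_order = (prevalent_a < prevalent_b) == (ci_a[0] < ci_b[0])
    if same_order:
        return "correct" if prevalent_sees_difference else "misleading but correct"
    return "incorrect" if prevalent_sees_difference else "misleading and incorrect"


# Hypothetical example: best-of-n values versus 95% confidence intervals.
print(classify(10.0, 11.0, (10.5, 0.1), (11.5, 0.1), theta=0.01))
```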
Experimental setup
AMD Athlon XP @ 2.1GHz, 2 GiB RAM,
Linux 2.6.18, idle
Jikes RVM svn head of February 12 2007
5 GCs from MMTk: CopyMS, GenCopy,
GenMS, MarkSweep, SemiSpace
SPECjvm98 and DaCapo
Heap sizes from the minimal heap up to 6 times as much
10 invocations on Athlon
decision threshold θ=1%
[Figure: percentage of all comparisons (0 to 25) per prevalent method (best, second best, worst, mean, median), for SPECjvm98 and DaCapo; legend: incorrect, misleading, misleading and incorrect, misleading but correct, indicative]
Athlon GenMS vs. other
GCs, 10 invocations, θ=1%
[Figure: percentage of all comparisons (0 to 30) per prevalent method (best, second best, worst, mean, median), for SPECjvm98 and DaCapo; legend: incorrect, misleading, misleading and incorrect, misleading but correct, indicative]
Raise the θ threshold?
javac, best-of-30, θ in [0;3]
[Figure: percentage of all comparisons (0 to 70) against the θ-threshold (0 to 3); legend: incorrect, misleading, misleading and incorrect, misleading but correct, indicative]
[Figure: percentage of all comparisons (0 to 50) for best median of (3,10), best of (3,10), best of (3,30), best of (5,10) and best of (5,30); legend entry: misleading and incorrect; label fragment: invocations, 10/30 iterations]
[Figure: width as percentage of the mean (0 to 10) against the number of VM invocations (2 to 30)]
Conclusion