Note that this integral corresponds to the area below the curve g(x).
One easy way to approximate the value of this integral is to partition the interval $[a, b]$ into thin slices of
width $\Delta$ and approximate the area of each slice with $\Delta\, g(x_i)$. This is shown in Figure 1. In other words,
Figure 1: $\int_a^b g(x)\,dx$ calculates the area below the curve $g$. Numerical methods partition the interval $[a, b]$ into thin
slices and then approximate the area of each thin slice.
$$\int_a^b g(x)\,dx \approx \sum_{i=1}^{n} \Delta\, g(x_i),$$
where $x_i$ is the midpoint of the $i$th slice, and $n = \lceil (b-a)/\Delta \rceil$. Clearly, this approximation suffers from a certain
amount of error. Using the mean value theorem we can provide an upper bound for the amount of error.
Also, it is intuitively clear that as $\Delta$ gets smaller, the number of slices $n$ increases and the error decreases.
Let's formalize this statement by providing an upper bound for the approximation error. Suppose $g$ is
differentiable and $|g'(x)| < C$ for all values of $x$, where $C$ is a constant. Using the mean value theorem we
have
$$|g(x) - g(x_i)| \le \frac{C\Delta}{2}, \qquad \forall x \in \left[x_i - \frac{\Delta}{2},\; x_i + \frac{\Delta}{2}\right].$$
Can you prove why? Therefore, we have
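$$\left|\int_{x_i-\Delta/2}^{x_i+\Delta/2} g(x)\,dx - \Delta\, g(x_i)\right| = \left|\int_{x_i-\Delta/2}^{x_i+\Delta/2} \bigl(g(x) - g(x_i)\bigr)\,dx\right| \le \frac{C\Delta^2}{2}. \tag{1}$$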
Try to describe why each step is correct. Using (1), it is straightforward to prove that
$$\left|\int_a^b g(x)\,dx - \sum_{i=1}^{n} \Delta\, g(x_i)\right| \le \frac{C(b-a)\Delta}{2} = \frac{C(b-a)^2}{2n}. \tag{2}$$
Again, try to prove this last step. Remember that in numeric integration $n$ is usually very large and, similar
to what we had in regression, the rate of decay of the error as $n$ grows is an important measure of quality
for a numeric integration method. The simple method we described above gives us a $1/n$ rate, as shown in (2).
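To make the bound in (2) concrete, here is a minimal Python sketch of the midpoint approximation described above. The test function $g(x) = \sin(x)$ on $[0, \pi]$ is my own choice for illustration (it is not from these notes); its exact integral is 2 and its derivative is bounded by 1 in absolute value, so I take $C = 1$. The function name `midpoint_rule` is likewise just an illustrative name.

```python
import math

def midpoint_rule(g, a, b, n):
    """Approximate the integral of g over [a, b] using n equal-width slices."""
    delta = (b - a) / n                   # slice width (Delta in the text)
    total = 0.0
    for i in range(n):
        x_i = a + (i + 0.5) * delta       # midpoint of the i-th slice
        total += delta * g(x_i)
    return total

# Illustrative assumption: g(x) = sin(x) on [0, pi], exact integral 2, C = 1.
a, b, C, exact = 0.0, math.pi, 1.0, 2.0

results = []
for n in (10, 100, 1000):
    error = abs(midpoint_rule(math.sin, a, b, n) - exact)
    bound = C * (b - a) ** 2 / (2 * n)    # right-hand side of (2)
    results.append((n, error, bound))
    print(f"n={n:5d}  error={error:.3e}  bound={bound:.3e}")
```

For a smooth function like this one the observed error should sit well below the bound, since (2) only uses information about the first derivative and is therefore a worst-case guarantee.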
The simple approach we described above can be extended to higher dimensional functions. Consider a
function $g : \mathbb{R}^d \to \mathbb{R}$ and suppose that we are interested in the numeric evaluation of
$$\int_a^b \int_a^b \cdots \int_a^b g(x_1, x_2, \ldots, x_d)\,dx_1\,dx_2 \cdots dx_d.$$
Note that for notational simplicity we have assumed that we are interested in the integral over $[a, b]^d$.
But the whole discussion carries over to more general forms of intervals.
Figure 2: Depiction of integration in dimensions greater than 1.
Then, again, if we would like to perform this integration, we should break $[a, b]^d$ into subintervals and
approximate the integral with
$$\sum_{i=1}^{n} g(x_1^i, x_2^i, \ldots, x_d^i)\,\Delta^d \approx \int_a^b \int_a^b \cdots \int_a^b g(x_1, x_2, \ldots, x_d)\,dx_1\,dx_2 \cdots dx_d.$$
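The sum above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the grid uses `m` slices per axis (so $n = m^d$ cubes of side $\Delta = (b-a)/m$), and the test function $g(x_1, \ldots, x_d) = x_1 + \cdots + x_d$ on $[0, 1]^d$, whose exact integral is $d/2$, is chosen by me and does not come from the notes.

```python
import itertools

def midpoint_rule_nd(g, a, b, d, m):
    """Midpoint sum over [a, b]^d with m slices per axis (n = m**d cubes).

    Returns the approximation and the number n of function evaluations.
    """
    delta = (b - a) / m
    centers_1d = [a + (k + 0.5) * delta for k in range(m)]
    total = 0.0
    n = 0
    # Every cube center is one combination of d one-dimensional midpoints.
    for center in itertools.product(centers_1d, repeat=d):
        total += g(*center)
        n += 1
    return total * delta ** d, n

# Hypothetical test function: the sum of the coordinates.
def g(*x):
    return sum(x)

for d in (1, 2, 3):
    approx, n = midpoint_rule_nd(g, 0.0, 1.0, d, m=10)
    print(f"d={d}  evaluations={n}  approx={approx:.6f}  exact={d / 2}")
```

Note how the number of evaluations grows as $m^d$ even though the resolution per axis stays the same.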
Figure 2 depicts the partitions we consider for the numeric integration. However, as you can imagine,
as the dimension goes up, we must consider many more intervals to achieve a certain level of accuracy. In
fact, under some minimal conditions, such as the boundedness of the gradient of the function, one can prove
that
$$\left|\int_a^b \int_a^b \cdots \int_a^b g(x_1, x_2, \ldots, x_d)\,dx_1\,dx_2 \cdots dx_d - \sum_{i=1}^{n} g(x_1^i, x_2^i, \ldots, x_d^i)\,\Delta^d\right| \le \frac{C'}{n^{1/d}}, \tag{3}$$
where $C'$ is a constant that depends on the maximum size of the partial derivatives and the interval of
integration, but is free of $n$. We will prove this statement later. As is clear from (3), as the dimension
increases the rate of decay of the error decreases. This means that we need a larger number of slices to
obtain a given level of accuracy. As the number of points (slices) increases, the computational complexity
of integration increases. Is it clear why? Therefore, the numeric integration becomes dramatically slower.
Let me give you an example. Assume for a moment that $C' = 1$ and we want to evaluate the integral with
error 0.01. Then we have
$$\frac{1}{n^{1/d}} \le 0.01 \quad\Rightarrow\quad n \ge 10^{2d}.$$
This means that if you want to calculate a numerical integral of a 10-dimensional function, then you have
to evaluate the function at $10^{20}$ points!
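The arithmetic behind this example is easy to tabulate. The sketch below assumes $C' = 1$ and a target error of $1/100$, as in the example above; the evaluation speed of $10^9$ function calls per second is a purely hypothetical figure that I introduce to give a sense of scale.

```python
def points_needed(d, inverse_error=100):
    """Smallest n with 1 / n**(1/d) <= 1 / inverse_error, assuming C' = 1.

    The condition is equivalent to n >= inverse_error**d.
    """
    return inverse_error ** d

EVALS_PER_SECOND = 10 ** 9  # hypothetical machine speed, for illustration only

for d in (1, 2, 5, 10):
    n = points_needed(d)
    seconds = n / EVALS_PER_SECOND
    print(f"d={d:2d}  n={n:.0e}  time at 1e9 evals/s: {seconds:.1e} s")
```

Under that assumed speed, the 10-dimensional case needs about $10^{11}$ seconds, which is on the order of thousands of years.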
Those of you who have tried numeric integration in MATLAB and R have noticed this problem. It is almost
impossible to get any result from them when the dimension is larger than 5. Now, let's try to prove the
upper bound for the $d$-dimensional numeric integration. Suppose that the function $g$ satisfies the following
property:
$$\max_{i=1,2,\ldots,d}\ \sup_{x_1, x_2, \ldots, x_d} \left|\frac{\partial g(x_1, x_2, \ldots, x_d)}{\partial x_i}\right| \le C. \tag{4}$$
We would like to calculate the integral
$$\int_a^b \int_a^b \cdots \int_a^b g(x_1, x_2, \ldots, x_d)\,dx_1\,dx_2 \cdots dx_d.$$
We partition the integration interval into $n$ equal-volume cubes of size $\Delta^d$, where $n\Delta^d = (b-a)^d$. Call
the center of the $i$th cube $(x_1^i, x_2^i, \ldots, x_d^i)$. We then have
Theorem 1. Let $g$ be a $d$-variate function that satisfies (4). Then
$$\left|\int_a^b \int_a^b \cdots \int_a^b g(x_1, x_2, \ldots, x_d)\,dx_1\,dx_2 \cdots dx_d - \sum_{i=1}^{n} g(x_1^i, x_2^i, \ldots, x_d^i)\,\Delta^d\right| \le \frac{Cd(b-a)^{d+1}}{2n^{1/d}}.$$
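Proof.
$$\begin{aligned}
&\left|\int_a^b \int_a^b \cdots \int_a^b g(x_1, x_2, \ldots, x_d)\,dx_1\,dx_2 \cdots dx_d - \sum_{i=1}^{n} g(x_1^i, x_2^i, \ldots, x_d^i)\,\Delta^d\right| \\
&\quad\overset{(a)}{=} \left|\sum_{i=1}^{n} \int_{x_1^i-\Delta/2}^{x_1^i+\Delta/2} \cdots \int_{x_d^i-\Delta/2}^{x_d^i+\Delta/2} g(x_1, x_2, \ldots, x_d)\,dx_1\,dx_2 \cdots dx_d - \sum_{i=1}^{n} g(x_1^i, x_2^i, \ldots, x_d^i)\,\Delta^d\right| \\
&\quad\overset{(b)}{=} \left|\sum_{i=1}^{n} \int_{x_1^i-\Delta/2}^{x_1^i+\Delta/2} \cdots \int_{x_d^i-\Delta/2}^{x_d^i+\Delta/2} \bigl(g(x_1, x_2, \ldots, x_d) - g(x_1^i, x_2^i, \ldots, x_d^i)\bigr)\,dx_1\,dx_2 \cdots dx_d\right| \\
&\quad\overset{(c)}{\le} \sum_{i=1}^{n} \int_{x_1^i-\Delta/2}^{x_1^i+\Delta/2} \cdots \int_{x_d^i-\Delta/2}^{x_d^i+\Delta/2} \bigl|g(x_1, x_2, \ldots, x_d) - g(x_1^i, x_2^i, \ldots, x_d^i)\bigr|\,dx_1\,dx_2 \cdots dx_d.
\end{aligned} \tag{5}$$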
Therefore, we should find an upper bound for $|g(x_1, x_2, \ldots, x_d) - g(x_1^i, x_2^i, \ldots, x_d^i)|$. We do it in the following
way:
$$\begin{aligned}
&|g(x_1, x_2, \ldots, x_d) - g(x_1^i, x_2^i, \ldots, x_d^i)| \\
&\quad\overset{(d)}{=} |g(x_1, x_2, \ldots, x_d) - g(x_1^i, x_2, x_3, \ldots, x_d) + g(x_1^i, x_2, x_3, \ldots, x_d) - g(x_1^i, x_2^i, \ldots, x_d^i)| \\
&\quad\overset{(e)}{\le} |g(x_1, x_2, \ldots, x_d) - g(x_1^i, x_2, x_3, \ldots, x_d)| + |g(x_1^i, x_2, x_3, \ldots, x_d) - g(x_1^i, x_2^i, \ldots, x_d^i)|.
\end{aligned} \tag{6}$$
It is straightforward to use the mean value theorem to bound the first term above. However, the second
term still looks complicated. Therefore, we use the same technique to simplify the second term:
$$\begin{aligned}
&|g(x_1^i, x_2, x_3, \ldots, x_d) - g(x_1^i, x_2^i, \ldots, x_d^i)| \\
&\quad= |g(x_1^i, x_2, x_3, \ldots, x_d) - g(x_1^i, x_2^i, x_3, \ldots, x_d) + g(x_1^i, x_2^i, x_3, \ldots, x_d) - g(x_1^i, x_2^i, \ldots, x_d^i)| \\
&\quad\le |g(x_1^i, x_2, x_3, \ldots, x_d) - g(x_1^i, x_2^i, x_3, \ldots, x_d)| + |g(x_1^i, x_2^i, x_3, \ldots, x_d) - g(x_1^i, x_2^i, \ldots, x_d^i)|.
\end{aligned} \tag{7}$$
If we repeat this process we can easily provide the following upper bound:
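$$\begin{aligned}
&|g(x_1, x_2, \ldots, x_d) - g(x_1^i, x_2^i, \ldots, x_d^i)| \\
&\quad\overset{(f)}{\le} |g(x_1, x_2, \ldots, x_d) - g(x_1^i, x_2, x_3, \ldots, x_d)| + |g(x_1^i, x_2, x_3, \ldots, x_d) - g(x_1^i, x_2^i, x_3, \ldots, x_d)| + \ldots \\
&\qquad + |g(x_1^i, x_2^i, \ldots, x_{d-1}^i, x_d) - g(x_1^i, x_2^i, \ldots, x_d^i)| \overset{(g)}{\le} \frac{Cd\Delta}{2}.
\end{aligned} \tag{8}$$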
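To see how the coordinate-by-coordinate argument works in the smallest nontrivial case, here is the chain written out for $d = 2$. The intermediate points $\xi_1$ and $\xi_2$ are my own notation (they do not appear in the notes): they are the points supplied by the one-variable mean value theorem, with $\xi_1$ between $x_1$ and $x_1^i$ and $\xi_2$ between $x_2$ and $x_2^i$. The point $(x_1, x_2)$ is assumed to lie in the cube centered at $(x_1^i, x_2^i)$, so each coordinate differs from the center by at most $\Delta/2$.

```latex
\begin{aligned}
|g(x_1, x_2) - g(x_1^i, x_2^i)|
&\le |g(x_1, x_2) - g(x_1^i, x_2)| + |g(x_1^i, x_2) - g(x_1^i, x_2^i)| \\
&= \left|\frac{\partial g}{\partial x_1}(\xi_1, x_2)\right| |x_1 - x_1^i|
 + \left|\frac{\partial g}{\partial x_2}(x_1^i, \xi_2)\right| |x_2 - x_2^i| \\
&\le C\,\frac{\Delta}{2} + C\,\frac{\Delta}{2} = \frac{2C\Delta}{2}.
\end{aligned}
```

This matches the general bound $Cd\Delta/2$ with $d = 2$: each of the $d$ terms changes a single coordinate and contributes at most $C\Delta/2$ by (4).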
If we combine (5) with (8) we obtain
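$$\left|\int_a^b \int_a^b \cdots \int_a^b g(x_1, x_2, \ldots, x_d)\,dx_1\,dx_2 \cdots dx_d - \sum_{i=1}^{n} g(x_1^i, x_2^i, \ldots, x_d^i)\,\Delta^d\right| \le \frac{nCd\Delta^{d+1}}{2} = \frac{Cd(b-a)^d\Delta}{2} = \frac{Cd(b-a)^{d+1}}{2n^{1/d}}. \tag{9}$$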
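As a sanity check on the final bound, the following Python sketch compares the actual midpoint error with $Cd(b-a)^{d+1}/(2n^{1/d})$ for a fixed budget of $n = 64$ cubes in dimensions 1, 2 and 3. Everything specific here is my own assumption for illustration: the test function $g(x) = \sin(x_1 + \cdots + x_d)$ on $[0, 1]^d$ (all partial derivatives are cosines, so $C = 1$ satisfies (4)), and the closed form used for its exact integral.

```python
import cmath
import itertools
import math

def g(x):
    """Hypothetical test function: sin of the sum of the coordinates."""
    return math.sin(sum(x))

def exact_integral(d):
    """Integral of sin(x_1 + ... + x_d) over [0, 1]^d.

    It is the imaginary part of (integral of exp(i x) over [0, 1]) ** d.
    """
    return (((cmath.exp(1j) - 1) / 1j) ** d).imag

def error_and_bound(d, m, a=0.0, b=1.0, C=1.0):
    """Midpoint-rule error with m slices per axis, and the theorem's bound."""
    delta = (b - a) / m
    n = m ** d
    centers = [a + (k + 0.5) * delta for k in range(m)]
    approx = delta ** d * sum(g(x) for x in itertools.product(centers, repeat=d))
    bound = C * d * (b - a) ** (d + 1) / (2 * n ** (1 / d))
    return abs(approx - exact_integral(d)), bound

# Same budget n = 64 in every dimension: m = 64, 8, 4 slices per axis.
checks = []
for d, m in ((1, 64), (2, 8), (3, 4)):
    error, bound = error_and_bound(d, m)
    checks.append((d, error, bound))
    print(f"d={d}  n={m ** d}  error={error:.3e}  bound={bound:.3e}")
```

With the same number of evaluations, the guaranteed accuracy gets worse as $d$ grows, which is exactly the $n^{-1/d}$ behavior in the theorem.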
In the next section we will see that in certain cases we can solve this problem more efficiently by the
Monte Carlo method.
1 Note that we are ignoring a subtle point here: the variance of the function $g$ usually grows for higher dimensional functions.
Therefore, in practice, Monte Carlo methods are not necessarily faster for high dimensional functions. But if the variance of $g$
is not too large, and we can easily obtain an estimate of the variance of $g$ from our samples, then Monte Carlo methods can be
efficient.