
Kernel Smoothing

When approximating probabilities of losses from a continuous distribution, it is better to use
a continuous estimator rather than the empirical distribution. In fact, to improve the discrete
estimation, we may correct the discrete empirical distribution via the method of smoothing.
To start with, assume $n$ observed values $\{x_1, \dots, x_n\}$. For each $x_i$ we choose a continuous
density function $k_{x_i}$, and let $K_{x_i}$ be the corresponding CDF. Then the smoothed density
function is

(kernel smoothed density function)

$$\hat{f}(x) = \sum_{i=1}^{n} \frac{1}{n}\, k_{x_i}(x)$$

This is indeed a density function: it is nonnegative and, being an average of densities, it
integrates to 1. The smoothed distribution function is

(kernel smoothed distribution function)

$$\hat{F}(x) = \sum_{i=1}^{n} \frac{1}{n}\, K_{x_i}(x)$$

This is indeed a distribution function. The corresponding distribution is called the kernel
smoothed distribution.
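The two formulas above can be sketched in a few lines of Python. This is only an illustration: the function names, the toy data, and the placeholder kernel (a uniform density on $[x_i - 1, x_i + 1]$) are assumptions made for the example, not part of the notes.

```python
def smoothed_density(x, data, kernel_pdf):
    """f_hat(x): average of the kernel densities k_{x_i}(x) over the data."""
    return sum(kernel_pdf(x, xi) for xi in data) / len(data)


def smoothed_cdf(x, data, kernel_cdf):
    """F_hat(x): average of the kernel CDFs K_{x_i}(x) over the data."""
    return sum(kernel_cdf(x, xi) for xi in data) / len(data)


# Placeholder kernel (an assumption): uniform density on [xi - 1, xi + 1].
def toy_pdf(x, xi):
    return 0.5 if abs(x - xi) <= 1 else 0.0


def toy_cdf(x, xi):
    # Linear from 0 at xi - 1 to 1 at xi + 1, clamped outside that interval.
    return min(1.0, max(0.0, (x - (xi - 1)) / 2))


toy_data = [1.0, 2.0, 4.0]  # hypothetical observations
print(smoothed_density(1.5, toy_data, toy_pdf))
print(smoothed_cdf(1.5, toy_data, toy_cdf))
```

Any pair of functions (density, CDF) indexed by the observation can be passed in, which is all the definition requires.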

The two most commonly used kernels are the uniform kernel and the triangular kernel.

1. Uniform Kernel

A uniform kernel is a uniform distribution over some interval $[x^* - b,\ x^* + b]$. The number $b$ is
called the bandwidth. Its density function is:

$$k_{x^*}(x) = \begin{cases} \dfrac{1}{2b} & x \in [x^* - b,\ x^* + b] \\[1ex] 0 & \text{otherwise} \end{cases}$$

By integrating w.r.t. $x$ we get its distribution function:

$$K_{x^*}(x) = \begin{cases} 0 & x \le x^* - b \\[1ex] \dfrac{x - (x^* - b)}{2b} & x \in [x^* - b,\ x^* + b] \\[1ex] 1 & x^* + b \le x \end{cases}$$

But note that the expression $x \in [x^* - b,\ x^* + b]$ is equivalent to $|x - x^*| \le b$. So we can
rewrite:

$$K_{x^*}(x) = \begin{cases} 0 & x \le x^* - b \\[1ex] \dfrac{x - (x^* - b)}{2b} & |x - x^*| \le b \\[1ex] 1 & x^* + b \le x \end{cases}$$

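This can be equivalently written as:

$$K_{x^*}(x) = \begin{cases} 0 & x + b \le x^* \\[1ex] \dfrac{x + b - x^*}{2b} & |x - x^*| \le b \\[1ex] 1 & x^* \le x - b \end{cases}$$

Or, reading the cases in terms of $x^*$ and the interval $[x - b,\ x + b]$ centered at the evaluation point $x$:

$$K_{x^*}(x) = \begin{cases} 1 & x^* \le x - b \\[1ex] \dfrac{\text{the right end-point} - x^*}{\text{length of the interval}} & x - b \le x^* \le x + b \\[1ex] 0 & x + b \le x^* \end{cases}$$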

Example. Consider the observations

600 , 615 , 618 , 620 , 637 , 637 , 645 , 675 , 685 , 690

Use the uniform kernel with bandwidth 15 to smooth the empirical distribution.
What are the estimates $\hat{f}(635)$ and $\hat{F}(635)$?

Solution.
The kernel density function has the value $\frac{1}{2b} = \frac{1}{30}$. There are 4 points in the interval
$[635 - 15,\ 635 + 15] = [620,\ 650]$. So

$$\hat{f}(635) = \frac{1}{30} \cdot \frac{\text{number of observations in } [620,\ 650]}{10} = \frac{4}{300} = 0.0133$$

To calculate $\hat{F}(635)$, we locate the interval $[620,\ 650]$ on the real line and then put the data
points on the real line. The points 600, 615, and 618 are below the interval, so each gets the
value 1. The points 675, 685, and 690 are above the interval, so each gets the value 0. The four
points that are inside the interval get the value $(650 - x_i)/30$:

600 ⇒ 1
615 ⇒ 1
618 ⇒ 1
620 ⇒ $\frac{30}{30}$
637 ⇒ $2\left(\frac{13}{30}\right)$
645 ⇒ $\frac{5}{30}$

⇒ Total = $\frac{151}{30}$ ⇒ Multiply by $\frac{1}{10}$ ⇒ $\hat{F}(635) = \frac{151}{300} = 0.5033$
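As a check on this example, here is a minimal sketch of the uniform kernel in Python using exact fractions. The function names `uniform_pdf` and `uniform_cdf` are chosen for this illustration only.

```python
from fractions import Fraction


def uniform_pdf(x, center, b):
    """Uniform kernel density k_center(x): 1/(2b) on [center - b, center + b]."""
    return Fraction(1, 2 * b) if abs(x - center) <= b else Fraction(0)


def uniform_cdf(x, center, b):
    """Uniform kernel CDF K_center(x)."""
    if x <= center - b:
        return Fraction(0)
    if x >= center + b:
        return Fraction(1)
    return Fraction(x - (center - b), 2 * b)


data = [600, 615, 618, 620, 637, 637, 645, 675, 685, 690]
b = 15
n = len(data)

f_hat = sum(uniform_pdf(635, xi, b) for xi in data) / n
F_hat = sum(uniform_cdf(635, xi, b) for xi in data) / n

print(f_hat, float(f_hat))  # 1/75, about 0.0133
print(F_hat, float(F_hat))  # 151/300, about 0.5033
```

Working with `Fraction` keeps the totals in the same form as the hand calculation (4/300 and 151/300) before converting to decimals.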

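The graph of any uniform-kernel-smoothed distribution function is piecewise linear:

[Figure: graph of a uniform-kernel-smoothed distribution function; a segment is labeled slope = 1/(nb).]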

Note. If the observed values $\{x_1, \dots, x_n\}$ have probabilities $p(x_i)$ different from $\frac{1}{n}$, then the
kernel-smoothed density is

$$\hat{f}(x) = \sum_{i=1}^{n} p(x_i)\, k_{x_i}(x)$$

Example (exercise 12.30 of the textbook). You are given the following ages at time of
death for 10 individuals:

25 , 30 , 35 , 35 , 37 , 39 , 45 , 47 , 49 , 55

Using a uniform kernel with a bandwidth of $b = 10$, determine the kernel density estimate of
the probability of survival to age 40.

Solution. We need to find $\hat{S}(40)$. For this we calculate $\hat{F}(40)$ first. The kernel density
function has the value $\frac{1}{2b} = \frac{1}{20}$. There are 8 points in the interval $[40 - 10,\ 40 + 10] = [30,\ 50]$.

We then locate the interval $[30,\ 50]$ on the real line and then put all data points on the real
line. The point 25 of the data set is below this interval, so it gets the value 1, and the point 55
is above it, so it gets the value 0. The eight points that are inside the interval get the value
$(50 - x_i)/20$:

25 ⇒ 1
30 ⇒ $\frac{20}{20}$
35 ⇒ $2\left(\frac{15}{20}\right)$
37 ⇒ $\frac{13}{20}$
39 ⇒ $\frac{11}{20}$
45 ⇒ $\frac{5}{20}$
47 ⇒ $\frac{3}{20}$
49 ⇒ $\frac{1}{20}$

⇒ Total = $\frac{103}{20}$ ⇒ Multiply by $\frac{1}{10}$ ⇒ $\hat{F}(40) = \frac{103}{200} = 0.515$

$\hat{S}(40) = 1 - 0.515 = 0.485$ ✓

Note. This data set could have been given in the following format:

tj sj rj
25 1 10
30 1 9
35 2 8
37 1 6
39 1 5
45 1 4
47 1 3
49 1 2
55 1 1
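To connect the table format with the raw data, the following sketch expands each row $(t_j, s_j, r_j)$ back into $s_j$ copies of $t_j$ and then recomputes the uniform-kernel estimate of survival to age 40 with $b = 10$. The helper name `uniform_cdf` is an assumption of this illustration.

```python
from fractions import Fraction

# Rows of the table: (t_j, s_j, r_j)
table = [(25, 1, 10), (30, 1, 9), (35, 2, 8), (37, 1, 6), (39, 1, 5),
         (45, 1, 4), (47, 1, 3), (49, 1, 2), (55, 1, 1)]

# Each t_j appears s_j times in the raw data set.
data = [t for t, s, _r in table for _ in range(s)]


def uniform_cdf(x, center, b):
    """Uniform kernel CDF K_center(x)."""
    if x <= center - b:
        return Fraction(0)
    if x >= center + b:
        return Fraction(1)
    return Fraction(x - (center - b), 2 * b)


b = 10
F_hat = sum(uniform_cdf(40, xi, b) for xi in data) / len(data)
S_hat = 1 - F_hat

print(data)
print(F_hat, float(F_hat))  # 103/200 = 0.515
print(S_hat, float(S_hat))  # 97/200 = 0.485
```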

Sometimes, this table is given to us instead. See the following example:

Example (exercise 12.29 of the textbook) ∗. You are given the data in the table below on
time to death. Using the uniform kernel with a bandwidth of 60, determine $\hat{f}(100)$.

tj sj rj
10 1 20
34 1 19
47 1 18
75 1 17
156 1 16
171 1 15

Solution. This data is complete for the part given. Each event has the probability of
occurrence equal to $\frac{1}{20}$.

The kernel density function has the value $\frac{1}{2b} = \frac{1}{120}$. There are 3 points in the interval
$[100 - 60,\ 100 + 60] = [40,\ 160]$. So

$$\hat{f}(100) = \frac{1}{120} \cdot \frac{\text{number of observations in } [40,\ 160]}{20} = \frac{3}{2400} = 0.00125 \approx 0.0013$$ ✓

Note. In this example, the ratio

$$\frac{\text{number of observations in } [40,\ 160]}{20}$$

is nothing but $P(X \le 160) - P(X < 40) = F(160) - F(40^-)$. In some examples the
underlying distribution might be different from the empirical distribution, in which case we use
the appropriate $F$ to calculate a difference like $F(160) - F(40^-)$; see the next example.

Example. In the previous example suppose that the Kaplan-Meier estimation of S(t) is used
(to calculate the probabilities). Answer the same question.

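Solution.

We have:

$$P(40 \le X \le 160) = P(X \le 160) - P(X < 40) = P(X \le 156) - P(X \le 34) = F(156) - F(34)$$

$$= S(34) - S(156) = \left(\frac{19}{20}\right)\left(\frac{18}{19}\right) - \left(\frac{19}{20}\right)\left(\frac{18}{19}\right)\cdots\left(\frac{15}{16}\right) = \frac{18}{20} - \frac{15}{20} = \frac{3}{20} = 0.15$$

Then:

$$\hat{f}(100) = \frac{1}{120}\, P(40 \le X \le 160) = \frac{1}{120}(0.15) = 0.00125 \approx 0.0013$$ ✓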
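The Kaplan-Meier version can be sketched as follows: the probability attached to each $t_j$ is the size of the jump of the product-limit estimate of $S$ at $t_j$, and $\hat{f}(100) = \sum_j p(t_j)\, k_{t_j}(100)$ with the uniform kernel. Variable names such as `km_mass` are assumptions of this illustration.

```python
from fractions import Fraction

# Rows of the table: (t_j, s_j, r_j)
table = [(10, 1, 20), (34, 1, 19), (47, 1, 18),
         (75, 1, 17), (156, 1, 16), (171, 1, 15)]

# Product-limit estimate: S(t_j) = S(t_{j-1}) * (1 - s_j / r_j).
# The probability mass at t_j is the drop S(t_{j-1}) - S(t_j).
survival = Fraction(1)
km_mass = {}
for t, s, r in table:
    new_survival = survival * (1 - Fraction(s, r))
    km_mass[t] = survival - new_survival
    survival = new_survival

x, b = 100, 60
# Uniform kernel: k_t(x) = 1/(2b) exactly when |t - x| <= b.
prob_in_window = sum(p for t, p in km_mass.items() if abs(t - x) <= b)
f_hat = prob_in_window / (2 * b)

print(prob_in_window)       # 3/20
print(f_hat, float(f_hat))  # 1/800 = 0.00125
```

In this data set every jump equals 1/20, so the Kaplan-Meier probabilities coincide with the empirical ones and both approaches give the same estimate.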

Note. In the professional exam, when there is no mention of probabilities, you should use
the empirical probabilities, which are all equal to $\frac{1}{n}$.

2. Triangular Kernel

Definition. The triangular kernel (density, or distribution) is

$$k_{x^*}(x) = \begin{cases} \dfrac{b - |x - x^*|}{b^2} & x \in [x^* - b,\ x^* + b] \\[1ex] 0 & \text{otherwise} \end{cases}$$

Note. kx∗ (x) is symmetric, i.e. kx∗ (x) = kx (x∗ ).

Example (from Finan's study guide). You are given the following ages at time of
death of 10 individuals:

25 , 30 , 35 , 35 , 37 , 39 , 45 , 47 , 49 , 55

Using a triangular kernel with bandwidth 10, find the kernel smoothed density estimate
$\hat{f}(40)$.

Solution. The base of the triangle based at $x = 40$ is the interval $[40 - 10,\ 40 + 10] = [30,\ 50]$. So:

$$k_{40}(x) = \begin{cases} \dfrac{10 - |x - 40|}{100} & x \in [30,\ 50] \\[1ex] 0 & \text{otherwise} \end{cases}$$

From the data set, only the following values fall in the interval [30 , 50]:

30 , 35 , 35 , 37 , 39 , 45 , 47 , 49

These eight points that are inside the interval get the values $(10 - |x_i - 40|)/100$:

30 ⇒ 0
35 ⇒ $2\left(\frac{5}{100}\right)$
37 ⇒ $\frac{7}{100}$
39 ⇒ $\frac{9}{100}$
45 ⇒ $\frac{5}{100}$
47 ⇒ $\frac{3}{100}$
49 ⇒ $\frac{1}{100}$

⇒ Total = $\frac{35}{100}$ ⇒ Multiply by $\frac{1}{10}$ ⇒ $\hat{f}(40) = 0.035$ ✓
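The same calculation can be sketched in Python with exact fractions; the function name `triangular_pdf` is an assumption of this illustration.

```python
from fractions import Fraction


def triangular_pdf(x, center, b):
    """Triangular kernel density: (b - |x - center|) / b^2 within distance b, else 0."""
    distance = abs(x - center)
    return Fraction(b - distance, b * b) if distance <= b else Fraction(0)


data = [25, 30, 35, 35, 37, 39, 45, 47, 49, 55]
b = 10

f_hat = sum(triangular_pdf(40, xi, b) for xi in data) / len(data)
print(f_hat, float(f_hat))  # 7/200 = 0.035
```

Because the kernel is symmetric, evaluating the kernel centered at each observation at the point 40 gives the same values as evaluating $k_{40}$ at each observation, which is what the hand calculation does.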

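By integrating $k_{x^*}(x)$ w.r.t. $x$ we get its distribution function:

$$K_{x^*}(x) = \begin{cases} 0 & x \le x^* - b \\[1ex] \dfrac{(b - |x - x^*|)^2}{2b^2} & x \in [x^* - b,\ x^*] \\[1ex] 1 - \dfrac{(b - |x - x^*|)^2}{2b^2} & x \in [x^*,\ x^* + b] \\[1ex] 1 & x^* + b \le x \end{cases}$$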

Equivalently:

$$K_{x^*}(x) = \begin{cases} 1 & x^* \le x - b \\[1ex] 1 - \dfrac{(b - |x - x^*|)^2}{2b^2} & x^* \in [x - b,\ x] \\[1ex] \dfrac{(b - |x - x^*|)^2}{2b^2} & x^* \in [x,\ x + b] \\[1ex] 0 & x + b \le x^* \end{cases}$$

Example. In the previous example, find the triangular kernel smoothed distribution estimate
$\hat{F}(40)$.

Solution. The base of the triangle based at $x = 40$ is the interval $[30,\ 50]$. So:

$$K_{x^*}(40) = \begin{cases} 1 & x^* \le 30 \\[1ex] 1 - \dfrac{(10 - |40 - x^*|)^2}{200} & x^* \in [30,\ 40] \\[1ex] \dfrac{(10 - |40 - x^*|)^2}{200} & x^* \in [40,\ 50] \\[1ex] 0 & 50 \le x^* \end{cases}$$

Every observation contributes to $\hat{F}(40)$. The point 25 lies to the left of the interval $[30,\ 50]$,
so it gets the value 1. The point 55 lies to the right of the interval, so it gets the value 0. The
eight points inside the interval,

30 , 35 , 35 , 37 , 39 , 45 , 47 , 49

get the values given by the two middle cases. Altogether:

25 ⇒ 1
30 ⇒ 1
35 ⇒ $2\left(1 - \frac{25}{200}\right)$
37 ⇒ $1 - \frac{49}{200}$
39 ⇒ $1 - \frac{81}{200}$
45 ⇒ $\frac{25}{200}$
47 ⇒ $\frac{9}{200}$
49 ⇒ $\frac{1}{200}$
55 ⇒ 0

⇒ Total = $\frac{1055}{200}$ ⇒ Multiply by $\frac{1}{10}$ ⇒ $\hat{F}(40) = 0.5275$ ✓
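A short sketch of the triangular kernel distribution function, applied to this example with exact fractions. The function name `triangular_cdf` is an assumption of this illustration. Every one of the ten observations enters the sum: a kernel lying entirely to the left of 40 (the one centered at 25) contributes 1, and a kernel lying entirely to the right of 40 (the one centered at 55) contributes 0.

```python
from fractions import Fraction


def triangular_cdf(x, center, b):
    """Triangular kernel CDF K_center(x) for bandwidth b."""
    if x <= center - b:
        return Fraction(0)
    if x >= center + b:
        return Fraction(1)
    tail = Fraction((b - abs(x - center)) ** 2, 2 * b * b)
    # Left half of the triangle: area accumulated so far.
    # Right half of the triangle: one minus the area still to come.
    return tail if x <= center else 1 - tail


data = [25, 30, 35, 35, 37, 39, 45, 47, 49, 55]
b = 10

total = sum(triangular_cdf(40, xi, b) for xi in data)
F_hat = total / len(data)

print(total)                # 211/40, i.e. 1055/200
print(F_hat, float(F_hat))  # 211/400 = 0.5275
```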

3. Calculating Mean and Variance of Kernel-Smoothed
Distributions

Example . Assume the following data set:

11 , 16 , 19 , 21

Suppose that we smooth the data with a uniform kernel of bandwidth 2. Calculate the mean and
variance of the smoothed distribution.

Solution. Let $X$ denote the smoothed random variable, and let $Y$ be the discrete random
variable having the empirical probability $\frac{1}{4}$ assigned to each of the observations. The conditional
random variable $(X \mid Y = 11)$ is uniformly distributed on an interval centered at 11,
therefore $E(X \mid Y = 11) = 11$. This argument holds for the other three observations as well,
therefore we have $E(X \mid Y) = Y$. Therefore:

$$E(X) = E[E(X \mid Y)] = E(Y) = \frac{11 + 16 + 19 + 21}{4} = 16.75$$ ✓

(this equality $E(X) = E(Y)$ holds for every kernel that is symmetric about its observation, not
just for the uniform kernel)

$$E(Y^2) = \frac{11^2 + 16^2 + 19^2 + 21^2}{4} = 294.75$$

$$\mathrm{Var}(Y) = E(Y^2) - E(Y)^2 = 294.75 - 16.75^2 = 14.1875$$

Note that for the observation 11,

$$\mathrm{Var}(X \mid Y = 11) = \mathrm{Var}(\text{uniform on an interval of length 4}) = \frac{4^2}{12} = \frac{4}{3}$$

And this holds equally well for all other observations. So the conditional variance
$\mathrm{Var}(X \mid Y)$ is the constant $\frac{4}{3}$, therefore its mean is also $\frac{4}{3}$. So, $E[\mathrm{Var}(X \mid Y)] = \frac{4}{3}$. On the other
hand,

$$\mathrm{Var}[E(X \mid Y)] = \mathrm{Var}(Y) = 14.1875$$

So then:

$$\mathrm{Var}(X) = \mathrm{Var}[E(X \mid Y)] + E[\mathrm{Var}(X \mid Y)] = 14.1875 + \frac{4}{3} = 15.5208$$ ✓
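The conditioning argument can be checked with a few lines of exact arithmetic. This is a sketch under the assumptions of the example (empirical probabilities 1/4, uniform kernel of bandwidth 2); the variable names are chosen for the illustration.

```python
from fractions import Fraction

data = [11, 16, 19, 21]
b = 2
n = len(data)

# Moments of the empirical variable Y (probability 1/n on each observation).
mean_y = Fraction(sum(data), n)
second_moment_y = Fraction(sum(y * y for y in data), n)
var_y = second_moment_y - mean_y ** 2

# Each conditional distribution is uniform on an interval of length 2b.
kernel_var = Fraction((2 * b) ** 2, 12)

mean_x = mean_y             # E(X) = E[E(X|Y)] = E(Y)
var_x = var_y + kernel_var  # Var(X) = Var[E(X|Y)] + E[Var(X|Y)]

print(float(mean_x))  # 16.75
print(float(var_y))   # 14.1875
print(float(var_x))   # about 15.5208
```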

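Second method (a longer method). We may use the density function of the smoothed
distribution to calculate the mean and variance of $X$. The smoothed density function is this step
function:

x      (−∞ , 9)   (9 , 13)   (13 , 14)   (14 , 17)   (17 , 18)   (18 , 19)   (19 , 21)   (21 , 23)   (23 , ∞)
f(x)   0          1/16       0           1/16        2/16        1/16        2/16        1/16        0

(it takes some time to complete the details of this table).

$$E(X) = \frac{1}{16}\left\{\int_{9}^{13} x\,dx + \int_{14}^{17} x\,dx + \int_{17}^{18} 2x\,dx + \int_{18}^{19} x\,dx + \int_{19}^{21} 2x\,dx + \int_{21}^{23} x\,dx\right\}$$

$$= \frac{1}{16}\left\{\left[\tfrac{1}{2}x^2\right]_{9}^{13} + \left[\tfrac{1}{2}x^2\right]_{14}^{17} + \left[x^2\right]_{17}^{18} + \left[\tfrac{1}{2}x^2\right]_{18}^{19} + \left[x^2\right]_{19}^{21} + \left[\tfrac{1}{2}x^2\right]_{21}^{23}\right\} = \ ?$$

$$E(X^2) = \frac{1}{16}\left\{\int_{9}^{13} x^2\,dx + \int_{14}^{17} x^2\,dx + \int_{17}^{18} 2x^2\,dx + \int_{18}^{19} x^2\,dx + \int_{19}^{21} 2x^2\,dx + \int_{21}^{23} x^2\,dx\right\}$$

$$= \frac{1}{16}\left\{\left[\tfrac{1}{3}x^3\right]_{9}^{13} + \left[\tfrac{1}{3}x^3\right]_{14}^{17} + \left[\tfrac{2}{3}x^3\right]_{17}^{18} + \left[\tfrac{1}{3}x^3\right]_{18}^{19} + \left[\tfrac{2}{3}x^3\right]_{19}^{21} + \left[\tfrac{1}{3}x^3\right]_{21}^{23}\right\} = \cdots$$

$$\mathrm{Var}(X) = E(X^2) - E(X)^2 = \cdots$$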
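The integrals of the second method can be evaluated exactly with a short sketch. It encodes the nonzero pieces of the step density as (left end, right end, height in sixteenths) and integrates $x^k f(x)$ piece by piece; the helper name `moment` is an assumption of this illustration.

```python
from fractions import Fraction

# Nonzero pieces of the step density: (left, right, height in units of 1/16).
pieces = [(9, 13, 1), (14, 17, 1), (17, 18, 2),
          (18, 19, 1), (19, 21, 2), (21, 23, 1)]


def moment(k):
    """Integral of x^k * f(x) dx, computed exactly on each piece."""
    return sum(
        Fraction(height, 16) * Fraction(right ** (k + 1) - left ** (k + 1), k + 1)
        for left, right, height in pieces
    )


total_mass = moment(0)     # should be 1 for a density
mean_x = moment(1)
second_moment_x = moment(2)
var_x = second_moment_x - mean_x ** 2

print(total_mass)              # 1
print(float(mean_x))           # 16.75
print(float(second_moment_x))  # about 296.0833
print(float(var_x))            # about 15.5208
```

The mean 67/4 = 16.75 and the variance 745/48 ≈ 15.5208 agree with what the conditioning argument (law of total variance) gives for this data set.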
