Generated by Foxit PDF Creator © Foxit Software

http://www.foxitsoftware.com For evaluation only.

5.4 Evaluation of Different Imputation Methods

To determine the effect of the nonresponse rate on the results of each imputation method (IM), the IMs were evaluated separately, and the results of each IM are discussed independently. For each IM, the discussion proceeds as follows: (1) bias of the mean of the imputed data, (2) distribution of the imputed data, assessed with the Kolmogorov-Smirnov Goodness of Fit Test, and (3) other measures of variability, namely the mean deviation (MD), mean absolute deviation (MAD), and root mean square deviation (RMSD).

Each table of results contains the following columns: (a) variable of interest (VI), (b) nonresponse rate (NRR), (c) the bias of the mean of the imputed data, Bias(y'), (d) the percentage of trials, out of 1000, in which the imputed data kept the distribution of the actual data set (PCD), (e) MD, (f) MAD, and (g) RMSD.
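The criteria in columns (c), (e), (f) and (g) can be sketched in code. The definitions below are assumptions, since this section does not state the formulas: the bias is taken as the mean of the imputed data set minus the mean of the actual data set, and the MD, MAD and RMSD are taken over the imputed cases only, as the mean, mean absolute and root mean square of the differences between imputed and true values. The function names and the data are hypothetical.

```python
import math

def bias_of_mean(actual, completed):
    """Mean of the completed (imputed) data set minus mean of the actual data set."""
    return sum(completed) / len(completed) - sum(actual) / len(actual)

def deviation_measures(true_values, imputed_values):
    """Return (MD, MAD, RMSD) over the imputed cases only (assumed definitions)."""
    diffs = [imp - true for imp, true in zip(imputed_values, true_values)]
    n = len(diffs)
    md = sum(diffs) / n
    mad = sum(abs(d) for d in diffs) / n
    rmsd = math.sqrt(sum(d * d for d in diffs) / n)
    return md, mad, rmsd

# Toy data (not from the study): 10 units, the last two are nonrespondents.
actual = [10.0, 12.0, 9.0, 11.0, 13.0, 8.0, 10.0, 12.0, 14.0, 6.0]
completed = actual[:8] + [11.0, 11.0]   # both missing values imputed as 11.0

bias = bias_of_mean(actual, completed)
md, mad, rmsd = deviation_measures(actual[8:], completed[8:])
```

Under these assumed definitions the bias equals the NRR times the MD, which is consistent with, for example, the 20% TOTEX2 row of Table 9 (0.20 x 897.18 is approximately 179.42).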

5.4.1 Overall Mean Imputation

Table 8 shows the results of the different criteria in evaluating the imputed data using the

OMI method.

Table 8: Criteria results for the OMI method

(a)     (b)         (c)      (d)        (e)        (f)        (g)
VI      NRR    BIAS(y')      PCD         MD        MAD       RMSD
TOTEX2  10%      640.66    0.00%   -6406.60   56929.61  108547.82
        20%      499.43    0.00%   -2497.14   59555.36  119193.32
        30%     -222.76    0.00%   20310.91   90396.26  271775.35
TOTIN2  10%     -597.84    0.00%    5978.39   77502.27  167206.24
        20%    -2855.49    0.00%   14277.43   87469.87  244758.00
        30%    -6093.27    0.00%     742.53   62388.11  151740.94

1. Bias of the mean of the imputed data

In column (c) of Table 8, the bias of the mean of the imputed data for TOTEX2 slowly decreases in magnitude as the NRR increases. The rationale behind this decrease is the decrease in magnitude of the respondents' mean as the NRR increases. As the respondents' mean decreases, the variability caused by imputing a single value (i.e., the mean of TOTEX1, the total expenditure in the first visit data, which is equal to 105566.9) that is higher than the mean of the actual data set also decreases.

On the other hand, the results for TOTIN2 are the opposite of those for TOTEX2: the bias of the mean of the imputed data rapidly increases in magnitude as the NRR increases. The rationale is again the decrease in magnitude of the respondents' mean as the NRR increases. However, unlike in TOTEX2, the imputed values (i.e., the mean of TOTIN1, the total income in the first visit data, which is equal to 121820.7) are much lower than the actual mean of the data set.

2. Distribution of the Imputed Data

Column (d) of Table 8 shows that, at all NRRs and for all VIs, the OMI method failed to maintain the distribution of the actual data. This was expected, primarily because each missing observation of a VI was replaced by a single value, the overall mean of the corresponding first visit variable.

Related studies that performed OMI stated that this method is one of the worst among all IMs since it distorts the distribution of the data. The distribution becomes too peaked, which makes this method unsuitable for many post-analyses (Cheng & Sy, 1999).
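The peaking described above can be seen in a minimal sketch of overall mean imputation. It assumes that missing second visit values are marked as None and that the imputed value is the mean of the first visit variable, as described in this section. The data and the function name are illustrative only.

```python
def overall_mean_impute(second_visit, first_visit):
    """Replace every missing (None) second visit value with the first visit mean."""
    fill = sum(first_visit) / len(first_visit)
    return [fill if y is None else y for y in second_visit]

first_visit = [50.0, 80.0, 120.0, 150.0]      # toy values, mean is 100.0
second_visit = [55.0, None, 130.0, None]      # two nonrespondents

completed = overall_mean_impute(second_visit, first_visit)
# Every imputed case receives the same value, so all of them fall in one class.
imputed_only = [c for c, y in zip(completed, second_visit) if y is None]
```

Because all imputed cases share one value, they necessarily land in a single frequency class, whatever the class boundaries are.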

3. Other measures of Variability

Columns (e), (f) and (g) of Table 8 show the other measures of variability of the imputed data. For TOTEX2, the values of the MAD and RMSD increase in magnitude as the NRR increases, and the data with the highest percentage of imputed values have the highest values for all three measures of variability. It is worth noting that all three criteria increase sharply in magnitude from the twenty to the thirty percent NRR for TOTEX2.

For TOTIN2, the data with twenty percent imputed observations have the highest values in all three measures of variability. Surprisingly, unlike for TOTEX2, the data under the highest NRR have the lowest values in all three measures.

5.4.2 Hot Deck Imputation

Table 9 shows the results of the different criteria in evaluating the imputed data using the hot deck imputation method with three imputation classes (HDI3).



Table 9: Criteria results for the HDI3 method

(a)     (b)         (c)      (d)        (e)        (f)        (g)
VI      NRR    BIAS(y')      PCD         MD        MAD       RMSD
TOTEX2  10%      491.91  100.00%    4919.40   78071.61   79251.22
        20%      179.42   96.90%     897.18   78292.63   67149.16
        30%     -606.37    0.00%   -2021.19   81395.79   71390.65
TOTIN2  10%     -717.52  100.00%   -7175.25  105369.15  242022.99
        20%    -3095.41  100.00%  -15477.09  111748.04  297151.50
        30%    -6508.65    1.00%  -21695.52  115087.13  313814.92

1. Bias of the mean of the imputed data

Similar to the results of the OMI method for TOTIN2, the bias of the mean of the imputed data rapidly increases in magnitude as the NRR increases. For TOTEX2, the bias fluctuates as the NRR increases. For both TOTEX2 and TOTIN2, the data with the highest NRR have the largest bias in magnitude. For TOTEX2, the data with twenty percent NRR gave the smallest bias, whereas for TOTIN2 the data with the lowest NRR gave the smallest bias.

2. Distribution of the Imputed Data

Column (d) shows that for TOTIN2, the data with ten and twenty percent of the observations imputed maintained the distribution of the actual data. For TOTEX2, only the data with ten percent of the observations imputed maintained the distribution of the actual data in all one thousand data sets. With twenty percent of the observations imputed, 969 of the 1000 data sets maintained the distribution of the actual data set.



For both TOTEX2 and TOTIN2, the data with the highest number of imputed observations failed to maintain the distribution of the actual data. Worse, none of the simulated data sets for TOTEX2 registered the same distribution as the actual data. For TOTIN2, on the other hand, only a lone data set maintained the same distribution as the actual data. The researchers consider it possible that more than one recipient shared the same donor.
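The donor-sharing possibility raised above can be examined with a small sketch of hot deck imputation within imputation classes. It assumes that each nonrespondent receives the value of a donor drawn at random, with replacement, from the respondents in the same class. This section does not state how donors were actually selected, and all names and data here are hypothetical.

```python
import random
from collections import Counter

def hot_deck_impute(values, classes, rng):
    """Fill None entries with a random respondent value from the same class.

    Returns the completed list and a Counter of how often each donor index
    was used. A class without any respondent would raise a KeyError.
    """
    donors_by_class = {}
    for i, (v, c) in enumerate(zip(values, classes)):
        if v is not None:
            donors_by_class.setdefault(c, []).append(i)
    completed, donor_use = list(values), Counter()
    for i, (v, c) in enumerate(zip(values, classes)):
        if v is None:
            donor = rng.choice(donors_by_class[c])  # sampling with replacement
            completed[i] = values[donor]
            donor_use[donor] += 1
    return completed, donor_use

# Toy data: the "low" class has two donors but three recipients.
values = [40.0, None, 45.0, None, None, 90.0, 95.0, None]
classes = ["low", "low", "low", "low", "low", "high", "high", "high"]
completed, donor_use = hot_deck_impute(values, classes, random.Random(0))
shared_donors = [d for d, n in donor_use.items() if n > 1]
```

In the toy "low" class there are more recipients than donors, so at least one donor must be reused; the same effect becomes more likely in real data as the NRR grows and the donor pool shrinks.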

3. Other measures of variability

Columns (e), (f) and (g) of Table 9 show the other measures of variability of the imputed data. For TOTEX2, the following results were obtained: (i) the data with twenty percent imputed values yielded the smallest values for the MD and RMSD, (ii) the data with the lowest number of imputations yielded the largest values for the MD and RMSD, and (iii) the MAD is the only criterion whose values increase as the NRR increases.

For TOTIN2, the following results were obtained: (i) all three criteria increase in magnitude as the NRR increases, (ii) the values of the three criteria are larger than those for TOTEX2, and (iii) the data with the largest number of imputations generated the highest values in the three criteria.



5.4.3 Deterministic Regression Imputation

Table 10 shows the results of the different criteria in evaluating the imputed data using

the deterministic regression imputation method with three imputation classes (DRI3).

Table 10: Criteria results for the DRI3 method

(a)     (b)         (c)      (d)        (e)        (f)        (g)
VI      NRR    BIAS(y')      PCD         MD        MAD       RMSD
TOTEX2  10%      536.32  100.00%    5363.47   33683.48   70553.64
        20%     1080.12   98.40%    5400.71   33782.60   72487.39
        30%      398.39  100.00%    1328.06   32449.49   72803.60
TOTIN2  10%      897.11  100.00%    9043.98   51363.17  106374.39
        20%    -1815.39  100.00%   -9076.98   57429.24  148278.49
        30%      356.50  100.00%    1188.31   51886.73  131429.61

1. Bias of the mean of the imputed data

Looking at column (c) of Table 10, the bias of the mean of the imputed data increases in magnitude as the NRR increases for TOTEX2 and TOTIN2. Compared to OMI and HDI3, where the bias increases tremendously as the NRR increases, the increase in bias for DRI3 is much slower: the bias of the data with twenty percent NRR is just twice the bias of the data with ten percent NRR. For TOTEX2, this method produces a larger bias of the mean of the imputed data at all NRRs than OMI and HDI3.

2. Distribution of the Imputed Data

Contrary to the results of the OMI method under this criterion, column (d) shows that the imputed data maintained the distribution of the actual data at all NRRs and for all VIs. DRI3 also fared better than HDI3, since all of the imputed data sets under all the NRRs and VIs preserved the same distribution as the actual data. It is interesting to note that the regression models used in this study did not behave as described in the related literature. Earlier studies that used categorical auxiliary variables, that is, matching variables transformed into dummy variables, concluded that DRI is just the same as mean imputation. In this study, however, the independent variable was the first visit VI, and each imputation class had its own fitted model, which registered a good R².
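A minimal sketch of deterministic regression imputation within imputation classes follows. It assumes a simple linear model of the second visit value on the first visit value, fitted by least squares separately in each class. The model form, the function names and the data are assumptions for illustration; the fitted models of the study are not reproduced here.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def dri_impute(first, second, classes):
    """Impute None entries of `second` with the class-specific fitted value."""
    models = {}
    for c in set(classes):
        pairs = [(x, y) for x, y, k in zip(first, second, classes)
                 if k == c and y is not None]
        models[c] = fit_line([p[0] for p in pairs], [p[1] for p in pairs])
    completed = []
    for x, y, c in zip(first, second, classes):
        a, b = models[c]
        completed.append(a + b * x if y is None else y)
    return completed

# Toy data: class A follows y = 2 + x, class B follows y = 0.9 * x.
first = [10.0, 20.0, 30.0, 40.0, 100.0, 200.0, 300.0, 250.0]
second = [12.0, 22.0, 32.0, None, 90.0, 180.0, 270.0, None]
classes = ["A", "A", "A", "A", "B", "B", "B", "B"]
completed = dri_impute(first, second, classes)
```

Every imputed value lies exactly on the fitted line of its class, so the imputed values vary only as much as the first visit values do; no residual variation is added.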

3. Other measures of variability

Columns (e), (f) and (g) of Table 10 show the other measures of variability of the imputed data. For these criteria, the following results were obtained. First, the three criteria are almost stable as the NRR increases for both TOTEX2 and TOTIN2; the rate of change of the MD, MAD and RMSD is minimal compared to OMI and HDI3. Second, the MAD and RMSD are smaller than those of OMI and HDI3 for both TOTEX2 and TOTIN2. Fitting models with a high R² was the key factor that made this method better than the two IMs previously evaluated.

5.4.4 Stochastic Regression Imputation

Table 11 shows the results of the different criteria in evaluating the imputed data using

the stochastic regression imputation method with three imputation classes (SRI3).

Table 11: Criteria Results for the SRI3 method

(a)     (b)         (c)      (d)        (e)        (f)        (g)
VI      NRR    BIAS(y')      PCD         MD        MAD       RMSD
TOTEX2  10%      536.32  100.00%    5363.47   33683.48   70553.64
        20%     1080.12   98.40%    5400.71   33782.60   72487.39
        30%      398.39  100.00%    1328.06   32449.49   72803.60
TOTIN2  10%      897.11  100.00%    9043.98   51363.17  106374.39
        20%    -1815.39  100.00%   -9076.98   57429.24  148278.49
        30%      356.50  100.00%    1188.31   51886.73  131429.61

1. Bias of the mean of the imputed data

Looking at column (c) of Table 11, for TOTEX2 and TOTIN2, this method yielded much better results than DRI3. The biases for TOTEX2 and TOTIN2 do not follow the pattern of the previous three methods, in which the bias increases as the NRR increases; instead, they fluctuate from one NRR to another. At the highest NRR, this method provided the least bias among the four methods for TOTIN2 and the second least, after OMI, for TOTEX2. For TOTIN2, where the other methods reached a four digit bias, SRI3 generated only a three digit bias. Moreover, there is a huge disparity at the third NRR, where SRI3 produced less than twenty percent of the bias produced by its deterministic counterpart.

2. Distribution of the imputed data

SRI3 performed better than HDI3, which also simulated the data 1000 times. Unlike HDI3, SRI3 maintained the distribution of the actual data in all imputed data sets for the first and third nonresponse rates. SRI3 also outperformed HDI3 at the twenty percent NRR. In earlier studies, stochastic regression imputation likewise performed better than any of the three other methods used here.

The random residual was added to the deterministic predicted value to preserve

the distribution of the data.
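The step described above, adding a random residual to the deterministic prediction, can be sketched as follows. The sketch assumes that the residual is drawn at random from the observed residuals of the fitted model; drawing from a normal distribution with the residual variance is another common choice. This section does not state which was used, and the coefficients, names and data are hypothetical.

```python
import random

def sri_impute(first, second, a, b, rng):
    """Impute None entries as a + b*x plus a randomly drawn observed residual.

    `a` and `b` are the coefficients of an already fitted line y = a + b*x.
    """
    residuals = [y - (a + b * x) for x, y in zip(first, second) if y is not None]
    completed = []
    for x, y in zip(first, second):
        if y is None:
            completed.append(a + b * x + rng.choice(residuals))
        else:
            completed.append(y)
    return completed, residuals

first = [10.0, 20.0, 30.0, 40.0, 50.0]
second = [13.0, 21.0, 33.0, None, None]
a, b = 2.0, 1.0                       # hypothetical fitted coefficients
completed, residuals = sri_impute(first, second, a, b, random.Random(1))
```

Repeating the imputation with different random draws gives different completed data sets, which is why a stochastic method such as this can be simulated many times and summarized by a percentage or an average.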

3. Other measures of variability

Columns (e), (f) and (g) of Table 11 show the other measures of variability of the imputed data. For these criteria, the following results were obtained. First, similar to the bias of the mean of the imputed data, the results for TOTIN2 fluctuate from one NRR to another in all the criteria. Second, for TOTEX2, only the RMSD increases as the NRR increases, while the MAD and MD fluctuate from one NRR to another. Third, the data with the highest NRR yielded the lowest values for the MD criterion. Fourth, for TOTIN2, the data with twenty percent NRR yielded the largest values for the three criteria.

5.5 Distribution of the True vs. Imputed Values

To provide additional information on the distribution of the imputed data discussed previously, the distributions of the true (deleted) values (TVs) and of the imputed values (IVs) from each of the IMs were obtained for all the VIs and NRRs. Tables 12, 13, and 14 show the frequency distributions of the methods with their corresponding relative frequencies (RFs) for the first, second, and third NRR, respectively. The RFs for the 1000 simulated data sets from HDI3 and SRI3 were averaged. The FCs column lists the frequency classes (FCs) of each VI. These are the same classes that were used in the Kolmogorov-Smirnov Goodness of Fit Test to determine the estimated percentage of similar distributions of the imputed data. For each NRR, the table containing the distribution of the actual and imputed values has the following columns: (a) VIs, (b) FCs, (c) RFs of the TVs (TV), (d) RFs of the OMI (OMI), (e) RFs of the HDI3 (HDI3), (f) RFs of the DRI3 (DRI3), and (g) RFs of the SRI3 (SRI3).
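As an illustration of how relative frequencies over frequency classes and a Kolmogorov-Smirnov type comparison might be computed, the sketch below bins values into classes and takes the largest absolute difference between the two cumulative relative frequency distributions. The class boundaries and data are toy values, the treatment of boundary values is an assumption, and the critical value used to decide whether a distribution is maintained is not given in this section, so only the statistic is computed.

```python
import bisect

def relative_frequencies(values, upper_bounds):
    """Relative frequency per class; the last class is open-ended.

    A value equal to a boundary is placed in the lower class (assumption).
    """
    counts = [0] * (len(upper_bounds) + 1)
    for v in values:
        counts[bisect.bisect_left(upper_bounds, v)] += 1
    return [c / len(values) for c in counts]

def ks_statistic(rf_a, rf_b):
    """Largest absolute gap between the cumulative relative frequencies."""
    gap, cum_a, cum_b = 0.0, 0.0, 0.0
    for pa, pb in zip(rf_a, rf_b):
        cum_a += pa
        cum_b += pb
        gap = max(gap, abs(cum_a - cum_b))
    return gap

upper_bounds = [10.0, 20.0, 30.0]            # classes: <=10, 10-20, 20-30, >30
true_values = [5.0, 15.0, 25.0, 35.0]
mean_imputed = [22.0, 22.0, 22.0, 22.0]      # every case imputed with one value

rf_true = relative_frequencies(true_values, upper_bounds)
rf_imputed = relative_frequencies(mean_imputed, upper_bounds)
d = ks_statistic(rf_true, rf_imputed)
```

In the toy example all imputed values fall in one class, mirroring the 100.00% entries in the OMI column of the tables that follow, and the cumulative distributions differ by as much as one half.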

Table 12: Distribution of the TVs and IVs: 10% NRR

10% NRR
(a)     (b)                        (c)      (d)      (e)      (f)      (g)
VI      FCs                         TV      OMI    HDI3*     DRI3    SRI3*
TOTEX2  <37869.5                10.90%    0.00%   13.90%    7.70%    9.50%
        37869.5 - 47056.5        9.70%    0.00%   10.20%    8.70%    8.70%
        47056.5 - 54922.0        9.70%    0.00%    9.70%   11.40%    6.10%
        54922.0 - 62365.0       11.40%    0.00%    8.90%   12.30%    9.50%
        63265.0 - 73868.0        8.70%    0.00%    9.10%   11.10%   11.40%
        73868.0 - 86103.0        9.70%    0.00%    9.40%   12.60%   11.10%
        86103.0 - 101947.0      10.90%    0.00%    9.40%    8.00%   11.10%
        101947.0 - 126254.5     11.10%  100.00%    8.90%   11.40%    8.50%
        126254.5 - 169964.0      9.00%    0.00%    8.90%    9.00%   12.20%
        >169964                  8.90%    0.00%   11.60%    7.70%   12.10%

TOTIN2  <40570                   9.70%    0.00%   15.10%    6.10%    9.10%
        40570.0 - 51564.0       10.20%    0.00%   11.90%    8.70%    7.90%
        51564.0 - 62006.5        9.40%    0.00%   10.10%   14.50%    8.30%
        62006.5 - 73900.5       10.20%    0.00%    9.50%   10.70%   10.00%
        73900.5 - 88127.0        9.00%    0.00%    9.60%   12.80%   12.40%
        88127.0 - 104801.0      10.90%    0.00%    9.30%    9.20%    9.00%
        104801.0 - 128000.0     11.90%  100.00%    9.80%    9.90%   10.50%
        128000.0 - 161669.0     11.40%    0.00%    7.80%   11.10%    9.30%
        161669.0 - 233907.0      7.70%    0.00%    8.00%   10.70%   11.20%
        >233907                  9.90%    0.00%    8.90%    6.30%   12.30%
* RF for each class was obtained by taking the average over the 1000 simulated data sets.

Table 13: Distribution of the TVs and IVs: 20% NRR

20% NRR
(a)     (b)                        (c)      (d)      (e)      (f)      (g)
VI      FCs                         TV      OMI    HDI3*     DRI3    SRI3*
TOTEX2  <37869.5                 9.40%    0.00%   14.30%    7.40%    8.20%
        37869.5 - 47056.5        9.70%    0.00%   10.40%    9.60%    7.60%
        47056.5 - 54922.0       11.60%    0.00%    9.70%    9.00%    8.20%
        54922.0 - 62365.0       10.00%    0.00%    9.00%   11.00%    7.90%
        63265.0 - 73868.0        9.60%    0.00%    9.20%   12.30%   10.30%
        73868.0 - 86103.0        8.40%    0.00%    9.40%   12.50%   11.90%
        86103.0 - 101947.0       9.60%    0.00%    9.30%    9.90%   10.30%
        101947.0 - 126254.5     11.30%  100.00%    8.70%   10.80%   11.80%
        126254.5 - 169964.0      9.70%    0.00%    8.70%    8.80%   11.70%
        >169964                 10.70%    0.00%   11.30%    8.70%   12.10%

TOTIN2  <40570                  10.00%    0.00%   15.70%    4.80%   11.80%
        40570.0 - 51564.0       10.30%    0.00%   12.10%   11.90%   12.20%
        51564.0 - 62006.5       11.70%    0.00%   10.10%   10.20%   11.30%
        62006.5 - 73900.5       10.20%    0.00%    9.60%   11.70%    9.90%
        73900.5 - 88127.0        8.60%    0.00%    9.50%   11.90%    8.50%
        88127.0 - 104801.0       9.40%    0.00%    9.30%    9.60%   10.10%
        104801.0 - 128000.0      9.10%  100.00%    9.70%   11.70%    9.00%
        128000.0 - 161669.0      9.20%    0.00%    7.60%    9.80%    8.30%
        161669.0 - 233907.0     11.30%    0.00%    7.80%    9.70%    8.90%
        >233907                 10.20%    0.00%    8.70%    8.70%   10.10%
* RF for each class was obtained by taking the average over the 1000 simulated data sets.

Table 14: Distribution of the TVs and IVs: 30% NRR

30% NRR
(a)     (b)                        (c)      (d)      (e)      (f)      (g)
VI      FCs                         TV      OMI    HDI3*     DRI3    SRI3*
TOTEX2  <37869.5                 9.80%    0.00%   14.30%    7.80%   10.30%
        37869.5 - 47056.5        8.80%    0.00%   10.40%    9.00%    9.60%
        47056.5 - 54922.0        9.60%    0.00%    9.70%    9.40%    8.30%
        54922.0 - 62365.0        9.50%    0.00%    8.90%   10.80%    9.30%
        63265.0 - 73868.0       11.00%    0.00%    9.20%   12.70%   10.10%
        73868.0 - 86103.0       10.70%    0.00%    9.40%   11.50%   10.60%
        86103.0 - 101947.0      10.70%    0.00%    9.40%   12.10%    9.80%
        101947.0 - 126254.5      9.40%  100.00%    8.70%    8.80%   10.10%
        126254.5 - 169964.0     11.00%    0.00%    8.70%    9.00%    8.10%
        >169964                  9.50%    0.00%   11.30%    9.00%   13.70%

TOTIN2  <40570                   9.40%    0.00%   15.60%    6.50%    8.90%
        40570.0 - 51564.0        9.00%    0.00%   12.10%   10.40%    8.20%
        51564.0 - 62006.5        9.90%    0.00%   10.10%   10.80%    8.80%
        62006.5 - 73900.5       10.70%    0.00%    9.60%   11.50%   10.10%
        73900.5 - 88127.0       10.20%    0.00%    9.50%   12.20%   11.00%
        88127.0 - 104801.0      10.30%    0.00%    9.30%   10.70%   10.20%
        104801.0 - 128000.0     10.30%  100.00%    9.70%   10.50%   10.40%
        128000.0 - 161669.0      9.80%    0.00%    7.60%   11.20%   10.80%
        161669.0 - 233907.0     10.70%    0.00%    7.70%    8.20%   10.30%
        >233907                  9.90%    0.00%    8.70%    8.00%   11.30%
* RF for each class was obtained by taking the average over the 1000 simulated data sets.

At all NRRs, the results clearly illustrate the distortion of the distribution under OMI. Since the OMI method assigns the mean of the first visit VI to all the missing cases, all the imputed values are concentrated in one particular frequency class. The three other methods, which implemented imputation classes, gave a better outcome than OMI by spreading the distribution of the imputed data.

For the HDI3 method, at all nonresponse rates, the imputed observations clustered mostly in the first frequency class, that is, less than 37869.5 for TOTEX2 and less than 40570.0 for TOTIN2. Clustering also occurred in the last frequency class for TOTEX2 at the first and third nonresponse rates, and in the second frequency class for TOTIN2 at all nonresponse rates. The percentage of the imputed data in the lowest class, for TOTEX2 and TOTIN2 at all nonresponse rates, ranges from 14% to 16%, compared to the actual percentage, which ranges only from 9% to 11%.

While some classes are overrepresented under HDI3, underrepresentation was observed in the interval 86103-126254.5 for the 10% and 20% nonresponse imputed data sets and in the interval 63265-101947 for the 30% nonresponse imputed data sets. Under the actual data, the percentage in the indicated interval for the 10% and 20% NRR totaled about 30%, while under the imputed data it totaled less than 30%.

Unlike hot deck and OMI, which each had a major cluster, the two regression imputation methods produced a more spread-out distribution, although some areas are underrepresented. The failure to include a random residual term in deterministic regression resulted in a severe underrepresentation of the data, in particular in the first frequency class. The SRI, on the other hand, which included a random residual, provided better results than DRI. However, in some areas the added random residual produced a significant excess, mostly in the last frequency class.

5.6 Choosing the best imputation method

In this section, the rankings on all the tests are the basis for determining which of the IMs is the best for this particular study and data. The best method is selected independently for each VI and NRR. The rankings are based on a four-point system in which a rank of 4 denotes the worst IM for a specific criterion and 1 denotes the best IM for that criterion. In case of ties, the average rank is substituted. The IM with the smallest rank total is declared the best IM for the particular VI and NRR. The ranking covers the following criteria: (a) bias of the mean of the imputed data (N.B.), (b) percentage of correct distributions (PCD), and (c) the other measures of variability, namely MD, MAD and RMSD. All in all, each IM is ranked on five criteria.
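The ranking scheme can be sketched as follows. The sketch assumes that smaller absolute values are better for the bias and the deviation measures, that a larger PCD is better, and that tied methods receive the conventional average of the rank positions they occupy. These rules are assumptions, and the tie values printed in the ranking tables of this section may follow a different convention. The criterion values are toy numbers, not results from the study.

```python
def average_ranks(scores):
    """Rank scores (smaller is better) from 1; ties get the average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        avg = (pos + 1 + end + 1) / 2      # mean of the tied rank positions
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks

methods = ["OMI", "HDI3", "DRI3", "SRI3"]
# Toy criterion values. PCD is negated so that a higher percentage
# gives a smaller, i.e. better, score.
criteria = {
    "bias": [abs(v) for v in (-6.0, 3.0, 9.0, -4.0)],
    "PCD": [-p for p in (0.0, 90.0, 100.0, 100.0)],
    "RMSD": [50.0, 40.0, 20.0, 30.0],
}
totals = [0.0] * len(methods)
for scores in criteria.values():
    for i, r in enumerate(average_ranks(scores)):
        totals[i] += r
best = methods[totals.index(min(totals))]
```

With these toy values the two regression methods tie on PCD and each receives 1.5, and the method with the smallest rank total is reported as the best.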

Tables 15, 16 and 17 show the rankings of the different imputation methods for the 10%, 20% and 30% NRR, respectively. For each NRR, the table containing the rankings of the IMs has the following columns: (a) VIs, (b) Criteria, (c) OMI, (d) HDI3, (e) DRI3, and (f) SRI3.

Table 15: Ranking of the Different IMs: 10% NRR

10% NRR
VI      CRITERIA        OMI    HDI3   DRI3   SRI3
TOTEX2  N.B.            3      1      4      2
        PCD             4      1.3    1.3    1.3
        MD              3      1      4      2
        MAD             3      4      1      2
        RMSD            4      3      1      2
        TOTAL           17     10.3   11.3   9.3
        Category Rank   4th    2nd    3rd    1st

TOTIN2  N.B.            1      2      4      3
        PCD             4      1.3    1.3    1.3
        MD              1      2      4      3
        MAD             3      4      1      2
        RMSD            3      4      1      2
        TOTAL           12     13.3   11.3   11.3
        Category Rank   3rd    4th    1st    1st

Table 16: Ranking of the Different IMs: 20% NRR

20% NRR
VI      CRITERIA        OMI    HDI3   DRI3   SRI3
TOTEX2  N.B.            2      1      4      3
        PCD             4      3      1      2
        MD              2      1      4      3
        MAD             3      4      1      2
        RMSD            4      2      1      3
        TOTAL           15     11     11     13
        Category Rank   4th    1st    1st    3rd

TOTIN2  N.B.            3      4      2      1
        PCD             4      1.3    1.3    1.3
        MD              3      4      2      1
        MAD             3      4      1      2
        RMSD            3      4      1      2
        TOTAL           16     17.3   7.3    7.3
        Category Rank   3rd    4th    1st    1st

Table 17: Ranking of the different IMs: 30% NRR

30% NRR
VI      CRITERIA        OMI    HDI3   DRI3   SRI3
TOTEX2  N.B.            1      3      4      2
        PCD             4      3      1.5    1.5
        MD              1      3      4      2
        MAD             3      4      1      2
        RMSD            4      2      1      3
        TOTAL           13     15     11.5   10.5
        Category Rank   3rd    4th    2nd    1st

TOTIN2  N.B.            3      4      2      1
        PCD             4      3      1.5    1.5
        MD              3      4      2      1
        MAD             3      4      1      2
        RMSD            3      4      1      2
        TOTAL           16     19     7.5    7.5
        Category Rank   3rd    4th    1st    1st

The rankings show that the two regression IMs provided better results than their model-free counterparts. For TOTIN2, at all nonresponse rates, the two regression imputation methods tied as the best IM and, surprisingly, HDI3 finished as the worst IM, behind OMI. For TOTEX2, mixed rankings were seen across the nonresponse rates, although the regression methods still provided good results. The SRI3 method finished first at the 10% and 30% NRR and third at the 20% NRR, while the DRI3 method finished third, first and second at the 10%, 20% and 30% NRR, respectively. While HDI3 was the worst IM for TOTIN2, OMI was the worst IM for TOTEX2, ranking last at both the 10% and 20% NRR and third at the 30% NRR.

In conclusion, the best imputation method for this study, which used the 1997 FIES data, is SRI3, very closely followed by DRI3. SRI3 never ranked last in any criterion, NRR or VI, unlike DRI3, which was the worst IM in the bias of the mean of the imputed data and in the MD criterion. The researchers selected HDI3 as the worst IM in this study. HDI3 fared the worst in most of the criteria, in particular in the other measures of variability at the 20% and 30% NRR.
