Much research effort has been devoted to the efficacy of various im-
vey Data, two simulation studies using the data in the 1978 Income
methods for the variable Hourly Rate of Pay while the second dealt
with the imputation of the variable Quarterly Earnings. For both
studies, the author stratified the data into its imputation classes,
constructed data sets with missing values by randomly deleting some of
the recorded values in the original data set and then applied the
various imputation methods to fill in the missing values. This process
was replicated ten times to ensure consistency of the results. Once the
imputation methods had been applied, the three measures for evaluating the
Absolute Deviation and the Root Mean Square Deviation were obtained
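The simulation design described above (delete recorded values at random, impute them, compare the imputed values with the deleted ones, and replicate) can be sketched roughly as follows. This is a minimal illustration and not the author's actual procedure: the definitions of the three measures are the usual ones and are assumed here, the data are synthetic, and the function names, the 20% deletion rate and the two methods shown (grand mean and class mean) are chosen only for illustration.

```python
import numpy as np

def evaluate(true_vals, imputed_vals):
    """Return (mean deviation, mean absolute deviation, root mean square deviation).

    Assumed definitions: deviations are imputed minus true values.
    """
    d = np.asarray(imputed_vals, dtype=float) - np.asarray(true_vals, dtype=float)
    return d.mean(), np.abs(d).mean(), np.sqrt((d ** 2).mean())

def grand_mean_impute(y, missing, classes):
    """Fill every missing value with the mean of all respondents.

    `classes` is unused; it is kept so both methods share one signature.
    """
    out = y.copy()
    out[missing] = y[~missing].mean()
    return out

def class_mean_impute(y, missing, classes):
    """Fill each missing value with the respondent mean of its imputation class."""
    out = y.copy()
    for c in np.unique(classes):
        in_class = classes == c
        donors = in_class & ~missing
        # Fall back to the grand respondent mean if a class has no respondents.
        fill = y[donors].mean() if donors.any() else y[~missing].mean()
        out[in_class & missing] = fill
    return out

def simulate(y, classes, methods, rate=0.2, replications=10, seed=0):
    """Average the three measures over repeated random deletions."""
    rng = np.random.default_rng(seed)
    n_missing = max(1, int(round(rate * y.size)))
    results = {name: [] for name in methods}
    for _ in range(replications):
        # Delete a fixed number of recorded values completely at random.
        missing = np.zeros(y.size, dtype=bool)
        missing[rng.choice(y.size, size=n_missing, replace=False)] = True
        for name, impute in methods.items():
            filled = impute(y, missing, classes)
            results[name].append(evaluate(y[missing], filled[missing]))
    return {name: np.mean(vals, axis=0) for name, vals in results.items()}

# Synthetic data (hypothetical): four imputation classes with different means.
demo_rng = np.random.default_rng(42)
classes = demo_rng.integers(0, 4, size=400)
y = 5.0 + 2.0 * classes + demo_rng.normal(0.0, 1.0, size=400)

results = simulate(y, classes, {"GM": grand_mean_impute, "CM": class_mean_impute})
for name, (md, mad, rmsd) in results.items():
    print(f"{name}: MD={md:.3f} MAD={mad:.3f} RMSD={rmsd:.3f}")
```

Because the mean deviation keeps the sign of each error, positive and negative errors can offset one another; the absolute and root mean square measures do not allow this, which is why all three are usually reported together.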
For the first study, which imputed the variable Hourly Rate of Pay, eight
methods were used, namely the Grand Mean Imputation (GM), the
Class Mean Imputation using eight imputation classes (CM8), the Class
plus a randomly chosen respondent residual (MR). Using the Mean
Deviation criterion, the results showed that all mean deviations were neg-
values. Moreover, the results show that the Grand Mean Imputation
Meanwhile for the Mean Absolute Deviation and Root Mean Square De-
the results showed that the Grand Mean Imputation fared the worst for
Imputation (MI) obtained the best measures for the two criteria and
vs. CM10, RC8 vs. RC10) yield slightly better results for the two criter-
ings, ten imputation procedures were used. These are the Grand Mean
twelve imputation classes (DI12). Using the first criterion, the Mean
Deviation, the results showed that the Grand Mean (GM) obtained a
positive bias. This implied that the grand mean imputation is not an
effective imputation method for this study. The results also showed that
tion methods (CM8 and CM12) have similar measures to those of the
duced relatively small mean deviations except for the last two meth-
ods. Comparing the Mean Absolute Deviations and the Root Mean
Square Deviations, the results show that the Grand Mean Imputation
domly chosen respondent residual or MR). The results also show that
the RC8, RC12, MN and MR procedures are over one-third larger com-
procedures, the author further divided the data into the deductive and
non-deductive cases. This shed further light on the Mean Deviations
was found that the mean deviations are positive in the deductive cases
and negative in the non-deductive cases for all of the procedures.
This then explains why there are relatively small deviations in the
previous results, since the measures for the two cases tend to cancel
out. It also showed that the DI8 and DI12 results are similar to those of
the RC8, RC12, CM8 and CM12 in the non-deductive cases but are
largely different in the deductive cases. This explains the larger values
At the end of the two studies, it showed that the imputation pro-
ate the Quarterly Earnings. Moreover, it showed how the mean imputa-
the original data with the statistics obtained from the five methods
while the M-step replaces the missing values with E-step generated
namely the functional health rating. Of the 492 cases, 20% of the cases were
searchers used the SPSS Missing Value Analysis function for the De-
ginal data and the five methods of the imputed variable with the
variables age, gender and self-assessed health rating (Musil, Warner,
Yobas & Jones, 2002). The results show that, comparing the mean of the
original data with the five methods, all imputed values underestimated the
mean. The closest to the original data was the Stochastic Regression,
Deletion and Mean Imputation. The same results also hold for the
Deletion and Mean Imputation. Hence, the finding suggests that the
Imputation is the least effective (Musil, Warner, Yobas & Jones, 2002).
simulation experiments of the Hot Deck Method. The first study fo-
than leaving the records with nonresponse out of the data set when
analyzing the variable, which is known as the Available Case Method.
This was done by constructing a fictitious data set of four values; two
rates were identified, namely 5%, 10% and 20%, and the simulation
process was replicated 50 times. The data set containing the missing
values was first analyzed using the Available Case Method, followed
by the Hot Deck Imputation. As in the methodology of Musil et al.,
and the available case method also with the original and hot deck
method were computed. Based on his criteria, the results show that
Hot Deck performs better than the Available Case Method. Also, it
showed that the Hot Deck, while it had results closer to the original
Hot Deck Imputation. Using the data of the Dutch Housing Demand
chosen as the variable to be imputed due to its importance and the fre-
The rationale for this choice was to ensure that the original values from
these categories would not be used as the replacements for the variable
were created once the missing values were already identified. A table
showed that in every category except for 13 and 22, which were set as
tion. This showed that the remaining records have an equal probability of
becoming a donor record for an imputation and that not all imputations
give values that are near category 13 or 22. Nordholt also explored
the Available Case Method and Hot Deck Method for this real-life data.
As in the first study, the Hot Deck fared better than the Available
tion. Using examples of how imputation is applied on the real life sur-
the imputation process due to the need of the study to be timely. The
Yu. They assessed the efficiency of the Mean Imputation versus Hot
generated incomplete data using the GAUSS software for the imputed
variables, which were the counts of cattle, hogs and chickens. In
Hot Deck Imputation Technique was better. Also, the design effect was
the Mean Imputation, since the ratio produced was less than one, they
percentage error and the variance of these percentage errors were the
basis for the precision of the estimates. The results show that Linear
Regression was the best method, followed closely by Multiple
Regression, then Hot Deck and finally Mean Imputation (Cheng and
Sy, 1999).
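The precision criterion mentioned above, the percentage error of an estimate and the variance of those percentage errors across replications, can be illustrated with a small sketch. It is not the authors' procedure: the percentage error is assumed here to be the relative error of the estimated mean against the complete-data mean, only Mean Imputation and a simple random Hot Deck are shown, and the data, the 10% nonresponse rate and all function names are hypothetical.

```python
import numpy as np

def mean_impute(y, missing, rng):
    """Replace each missing value with the respondent mean (rng is unused)."""
    out = y.copy()
    out[missing] = y[~missing].mean()
    return out

def hot_deck_impute(y, missing, rng):
    """Replace each missing value with the value of a randomly drawn respondent."""
    out = y.copy()
    donors = y[~missing]
    out[missing] = rng.choice(donors, size=int(missing.sum()), replace=True)
    return out

def percentage_error(estimate, truth):
    """Signed error of the estimate as a percentage of the true value."""
    return 100.0 * (estimate - truth) / truth

def compare(y, methods, rate=0.1, replications=50, seed=0):
    """Return {method: (mean percentage error, variance of percentage errors)}."""
    rng = np.random.default_rng(seed)
    n_missing = max(1, int(round(rate * y.size)))
    errors = {name: [] for name in methods}
    for _ in range(replications):
        # Mark a fixed number of records as nonrespondents, completely at random.
        missing = np.zeros(y.size, dtype=bool)
        missing[rng.choice(y.size, size=n_missing, replace=False)] = True
        for name, impute in methods.items():
            estimate = impute(y, missing, rng).mean()
            errors[name].append(percentage_error(estimate, y.mean()))
    return {name: (np.mean(e), np.var(e, ddof=1)) for name, e in errors.items()}

# Synthetic count data (hypothetical), standing in for a livestock count variable.
demo_rng = np.random.default_rng(1)
counts = demo_rng.poisson(lam=30, size=200).astype(float)

summary = compare(counts, {"mean": mean_impute, "hot deck": hot_deck_impute})
for name, (avg_err, var_err) in summary.items():
    print(f"{name}: mean % error={avg_err:.3f}, variance={var_err:.4f}")
```

Regression-based methods would additionally need an auxiliary variable, which is why they are left out of this sketch.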