Comparison Of Imputation Methods For Missing Values In Longitudinal Data
Date
2017-06
Authors
Publisher
University of Ghana
Abstract
Longitudinal data are common in various sectors where repeated measurements
on a dependent variable are collected for all subjects. Missing data patterns
arise when planned measurements are unavailable for some subjects. The
dropout process may give rise to one of three missing-value mechanisms, namely
Missing Completely at Random (MCAR), Missing at Random (MAR), and Missing Not
at Random (MNAR). Missing values can seriously affect a quantitative study,
leading to biased parameter estimates, loss of information, reduced statistical
power, increased standard errors, and weakened generalizability of
findings. This thesis compared the performance of seven (7) techniques for
imputing missing values under the assumptions of MCAR and MAR mechanisms.
The study used Little's test to check whether the missing values in a
dataset are MCAR or MAR. The techniques for handling missing values
were compared using the Generalized Estimating Equation (GEE) model for the
complete dataset, the coefficient of determination, and the root mean squared
error (RMSE). The study found that, whether a large (above 10%) or small
(below 10%) proportion of values is missing at random (MAR), it is important
to use multiple imputation or expectation maximization to replace the missing
values in the dataset. Pairwise deletion performed best under the MCAR
mechanism. Listwise deletion and hot deck imputation performed poorly under
the MCAR mechanism.
It is recommended that researchers understand the patterns of missing
values in their datasets and clearly recognize missing data problems and the
situations under which they occur. Further research is needed to find a better
method for imputing values that are missing not at random (MNAR) with multiple
imputation. This thesis focused on missing values in a longitudinal dataset;
future research using categorical data would be a step in the right direction.
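
The kind of comparison described in the abstract can be illustrated with a small simulation. The sketch below is not the thesis's code or data: it generates a toy longitudinal panel, deletes follow-up values under an MCAR or a MAR rule, fills them in with two simple methods (visit-mean imputation, chosen here only as a baseline, and a random hot deck), and scores the imputed cells with RMSE and the coefficient of determination. All function names, the panel design, the 20% missingness rate, and the choice to score only the deleted cells are assumptions made for illustration; the GEE step is omitted.

```python
import math
import random


def make_panel(n_subjects=50, n_visits=4, seed=0):
    """Simulate a toy panel: one list of visit values per subject (assumed design)."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_subjects):
        baseline = rng.gauss(50.0, 10.0)
        panel.append([baseline + 2.0 * t + rng.gauss(0.0, 1.0) for t in range(n_visits)])
    return panel


def mask_mcar(panel, rate, seed=1):
    """MCAR: every follow-up value is deleted with the same probability."""
    rng = random.Random(seed)
    return [[v if t == 0 or rng.random() >= rate else None
             for t, v in enumerate(row)] for row in panel]


def mask_mar(panel, rate, seed=1):
    """MAR: deletion probability depends only on the always-observed baseline."""
    rng = random.Random(seed)
    median = sorted(row[0] for row in panel)[len(panel) // 2]
    masked = []
    for row in panel:
        # Subjects with a high baseline lose follow-ups; the others lose none.
        p = min(1.0, 2.0 * rate) if row[0] > median else 0.0
        masked.append([v if t == 0 or rng.random() >= p else None
                       for t, v in enumerate(row)])
    return masked


def impute_visit_mean(masked):
    """Replace each missing value with the mean of the observed values at that visit."""
    means = []
    for t in range(len(masked[0])):
        observed = [row[t] for row in masked if row[t] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[t] if v is None else v for t, v in enumerate(row)] for row in masked]


def impute_hot_deck(masked, seed=2):
    """Random hot deck: copy the value of a randomly chosen donor observed at that visit."""
    rng = random.Random(seed)
    donors = [[row[t] for row in masked if row[t] is not None]
              for t in range(len(masked[0]))]
    return [[rng.choice(donors[t]) if v is None else v for t, v in enumerate(row)]
            for row in masked]


def score(truth, masked, filled):
    """Return (RMSE, R^2) computed over the cells that were deleted."""
    pairs = [(truth[i][t], filled[i][t])
             for i, row in enumerate(masked) for t, v in enumerate(row) if v is None]
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in pairs)
    mean_y = sum(y for y, _ in pairs) / len(pairs)
    ss_tot = sum((y - mean_y) ** 2 for y, _ in pairs)
    return math.sqrt(ss_res / len(pairs)), 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    truth = make_panel()
    for label, mask in (("MCAR", mask_mcar), ("MAR", mask_mar)):
        masked = mask(truth, 0.2)
        for name, impute in (("visit mean", impute_visit_mean), ("hot deck", impute_hot_deck)):
            rmse, r2 = score(truth, masked, impute(masked))
            print(f"{label:4s} {name:10s} RMSE={rmse:.2f} R2={r2:.2f}")
```

Results from a toy simulation like this say nothing about the thesis's findings; the sketch only shows how a missingness mechanism, an imputation method, and an accuracy measure fit together in such a comparison.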
Description
Thesis (MPhil)
Keywords
Comparison, Imputation Methods, Missing Values, Longitudinal Data