Abstract:
Malaria is among the leading causes of mortality and morbidity among children in Ghana. Identifying the
predictors of malaria prevalence in children under five is therefore among the priorities of the global health
agenda. In Ghana, studies that use machine learning techniques rather than traditional statistical methods to
identify predictors of malaria prevalence are scarce. Thus, the present study used machine learning techniques to identify
variables to build the best-fitting predictive model of malaria prevalence in Ghana. We analysed data on 2867
children under five with malaria rapid diagnostic test (RDT) results from the 2019 Ghana Malaria Indicator
Survey. LASSO, Ridge, and Elastic Net regression methods were used to select variables to build predictive
models. Analyses were performed in R version 4.0.2. About one in four children (25.04%) tested positive for
malaria. The logit models built on the features selected by LASSO, Ridge, and Elastic Net contained eleven,
fifteen, and thirteen features, respectively. The LASSO regression model was preferred because it had the
fewest predictors and the smallest prediction error. The significant predictors of malaria among children were
being older than 24 months, residing
in the poorest households, being severely anaemic, residing in households without electricity, and residing in a
rural area. The predictors identified in our study deserve policy attention and interventions to strengthen malaria
control efforts in Ghana. The machine learning techniques employed in our study, especially the LASSO
regression technique, could be beneficial for identifying predictors of malaria prevalence in this group of children.
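
For orientation, the three methods named above can be read as special cases of a single penalized logistic regression. The sketch below uses the common elastic net parametrization with a mixing parameter alpha and a penalty strength lambda. The abstract does not state which software package or parametrization was used, so the symbols here (N, y_i, x_i, beta_0, beta, lambda, alpha) are assumptions chosen for illustration, not the study's own notation.

```latex
% Penalized logistic regression (elastic net form); notation is illustrative.
% y_i \in \{0,1\}: RDT result for child i; x_i: vector of candidate predictors.
(\hat{\beta}_0, \hat{\beta})
  = \arg\min_{\beta_0,\,\beta}
    \left\{
      -\frac{1}{N} \sum_{i=1}^{N}
        \left[
          y_i \left( \beta_0 + x_i^{\top}\beta \right)
          - \log\!\left( 1 + e^{\beta_0 + x_i^{\top}\beta} \right)
        \right]
      + \lambda
        \left[
          \frac{1-\alpha}{2} \lVert \beta \rVert_2^2
          + \alpha \lVert \beta \rVert_1
        \right]
    \right\}
% \alpha = 1      : LASSO (L1 penalty only)
% \alpha = 0      : Ridge (L2 penalty only)
% 0 < \alpha < 1  : Elastic Net (mixture of L1 and L2)
```

Under this formulation, the L1 term is what can set coefficients exactly to zero, which is consistent with LASSO yielding the most compact of the three models reported above. In practice lambda is usually chosen by cross-validation, although the abstract does not say how it was tuned here.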