children
Article
Performance of Machine Learning Classifiers in Classifying
Stunting among Under-Five Children in Zambia
Obvious Nchimunya Chilyabanyama 1,2,* , Roma Chilengi 2, Michelo Simuyandi 2, Caroline C. Chisenga 2,
Masuzyo Chirwa 2 , Kalongo Hamusonde 2, Rakesh Kumar Saroj 3 , Najeeha Talat Iqbal 4 ,
Innocent Ngaruye 5 and Samuel Bosomprah 2,6
1 African Centre of Excellence in Data Science, College of Business Studies Kigali, University of Rwanda,
Gikondo—Street, KK 737, Kigali P.O. Box 4285, Rwanda
2 Enteric Disease and Vaccines Research Unit, Centre for Infectious Disease Research in Zambia,
Lusaka P.O. Box 34681, Zambia; roma.chilengi@cidrz.org (R.C.); michelo.simuyandi@cidrz.org (M.S.);
caroline.chisenga@cidrz.org (C.C.C.); masuzyo.chirwa@cidrz.org (M.C.);
kalongo.hamunsonde@cidrz.org (K.H.); samuel.bosomprah@cidrz.org (S.B.)
3 Department of Community Medicine, Sikkim Manipal Institute of Medical Sciences (SIMMS) Sikkim Manipal
University, Gangtok 03592, India; rakesh.saroj@bhu.ac.in
4 Department of Paediatrics and Child Health, Biological and Biomedical Sciences,
Aga Khan University Hospital, Karachi 74800, Pakistan; najeeha.iqbal@aku.edu
5 College of Science of Technology, University of Rwanda, KN 7 Ave, Kigali P.O. Box 4285, Rwanda;
igaruye@gmail.com
6 Department of Biostatistics, School of Public Health, University of Ghana, Accra P.O. Box LG13, Ghana
* Correspondence: chilyabanyama@gmail.com
Abstract: Stunting is a global public health issue. We sought to train and evaluate machine learning (ML) classification algorithms on the Zambia Demographic Health Survey (ZDHS) dataset to predict stunting among children under the age of five in Zambia. We applied Logistic regression (LR), Random Forest (RF), SV classification (SVC), XG Boost (XgB) and Naïve Bayes (NB) algorithms to predict the probability of stunting among children under five years of age, on the 2018 ZDHS dataset. We calibrated predicted probabilities and plotted the calibration curves to compare model performance. We computed accuracy, recall, precision and F1 for each machine learning algorithm. A total of 2327 (34.2%) children were stunted. Thirteen of fifty-eight features were selected for inclusion in the model using random forest. Calibrating the predicted probabilities improved the performance of machine learning algorithms when evaluated using calibration curves. RF was the most accurate algorithm, with an accuracy score of 79% in the training and 61.6% in the testing data, while Naïve Bayesian was the worst performing algorithm for predicting stunting among children under five in Zambia using the 2018 ZDHS dataset. ML models aid quick diagnosis of stunting and the timely development of interventions aimed at preventing stunting.
Keywords: stunting; machine learning; random forest; Naïve Bayesian; ZDHS
Citation: Chilyabanyama, O.N.; Chilengi, R.; Simuyandi, M.; Chisenga, C.C.; Chirwa, M.; Hamusonde, K.; Saroj, R.K.; Iqbal, N.T.; Ngaruye, I.; Bosomprah, S. Performance of Machine Learning Classifiers in Classifying Stunting among Under-Five Children in Zambia. Children 2022, 9, 1082. https://doi.org/10.3390/children9071082
Academic Editor: Tonia Vassilakou
Received: 19 May 2022
Accepted: 24 June 2022
Published: 20 July 2022
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
Stunting is still one of the most serious health and welfare issues globally. In 2019, about 21.3% of children under five years of age were estimated to be stunted globally, and two out of five stunted children live in Africa [1]. Between 2000 and 2020, the global prevalence of stunting fell from about 30.3% to 22% [2]. Despite the fall in the magnitude of stunting, the prevalence of stunting has remained high in the sub-Saharan region [3]. In Zambia, the prevalence of stunting among children under five was estimated to be 35% in 2019 [4]. Prevalence of stunting is classified as very low, low, medium, high and very high if it is <2.5%, between 2.5 and 10%, between 10 and 20%, between 20 and 30% and above 30%, respectively [5].
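The prevalence thresholds quoted from [5] can be written as a small lookup. This is only an illustrative sketch: the function name is hypothetical, and the handling of values that fall exactly on a boundary (lower bound inclusive, upper bound exclusive) is an assumption, since the text above does not state it.

```python
def classify_stunting_prevalence(prevalence_pct):
    """Map a stunting prevalence (in percent) to the severity label from [5].

    Boundary handling (lower bound inclusive) is an assumption of this sketch.
    """
    if prevalence_pct < 2.5:
        return "very low"
    if prevalence_pct < 10:
        return "low"
    if prevalence_pct < 20:
        return "medium"
    if prevalence_pct < 30:
        return "high"
    return "very high"


# Figures quoted in the text: Zambia (35%) and the 2020 global estimate (22%).
zambia_label = classify_stunting_prevalence(35)
global_label = classify_stunting_prevalence(22)
```

Under these thresholds, the Zambian estimate of 35% falls in the "very high" category.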
Children 2022, 9, 1082. https://doi.org/10.3390/children9071082 https://www.mdpi.com/journal/children
Stunting among children is associated with short-term and long-term health and
social outcomes. Mortality and morbidity are among the most common short-term effects
of stunting [6,7]. Some long-term effects of childhood stunting include poor cognitive
development, poor school performance, delay in motor development and poor maternal
health outcomes [8–10].
Over the years, classical statistical models have been used to identify factors that
are independently associated with stunting among children under five [11–13]. However,
these methods tend not to be robust in situations where the number of covariates exceeds the
number of observations and when there is multicollinearity among variables. Furthermore,
they follow strict assumptions about the data and the data generating process, such as the
distribution of errors and additivity of parameters with linear predictors, which may not
hold in real-life [14]. Compared to classical models, machine learning models overcome
the analytical challenges of a large number of covariates and multicollinearity, require
fewer assumptions, incorporate high dimensional data and thus produce a more flexible
relationship between predictor and outcome variables [15]. These methods have been
applied in predicting malnutrition using different datasets [16–19]. Furthermore, machine
learning methods have been shown to be superior to classical statistical methods when
solving classification problems [20].
In this study, we aimed to train, evaluate and select the best machine learning classifier for predicting stunting among children under five years in Zambia and to identify important variables in the prediction of stunting using the 2018 Zambia Demographic and Health Survey dataset. This model would serve as the basis for developing an intelligent model for diagnosing or predicting stunting, and features identified as important predictors of stunting would serve as variables to target when designing interventions aimed at preventing stunting among children under five in Zambia.
2. Materials and Methods
2.1. Data Source and Research Workflow
We utilized nutrition data from the 2018 Zambia Demographic and Health Survey (ZDHS)
conducted by the Zambia Statistics Agency in collaboration with USAID. The survey
was conducted such that it is representative of the Zambian population. It employed a
stratified two-stage sampling design. The strata were defined by province and residence
(i.e., rural-urban)—there are 10 provinces in Zambia, giving a total of 20 strata. The first
stage involved selecting clusters defined as Enumeration Areas (EAs). For each stratum,
EAs were selected using a probability proportional to size algorithm. In the second stage,
a fixed number of households were selected from each EA using a systematic sampling
technique. Details of the sampling methods are described in [4].
The ZDHS was performed per the Declaration of Helsinki and approved by an ap-
propriate ethics committee. Ethical clearance was obtained from the Ethical Review Com-
mittee of the Ministry of Health, the University of Zambia Biomedical Ethics Committee
and the Tropical Disease Research Centre Ethics Committee. The 2018 ZDHS survey
was approved on the 27th of March 2018 by the TDRC ethics committee under pro-
tocol number STC/2018/6 and by the IRB on 6th March 2018 under protocol number
132989.0.000.ZM.DHS.02. Informed consent was obtained from participants before data
collection. Permission was sought and granted on 22 March 2021 from the DHS program
to use this dataset for research, and the dataset was accessed through IPUMS [21]. All
data were anonymized before the authors received the data. All methods were performed
following the relevant guidelines and regulations.
Figure 1 below depicts the research workflow. Pre-processing was followed by feature selection, after which the data were split 70:30. On the 70% of the dataset (training dataset), models were trained, selected and their performance evaluated, while on the 30% of the data (testing dataset), the predictive models were validated and their performance in predicting stunting compared.
Figure 1. Workflow chart.
2.2. Pre-Processing
In this study, the target feature was stunting, which was defined based on the WHO standard of height-for-age Z-score (HAZ) < −2 standard deviations (SD) [5]. The mothers’ socioeconomic and demographic characteristics and feeding practices were selected as features from the ZDHS database. Missing instances were dropped from the analysis. Further, continuous variables were standardized to a standard normal distribution. Then we applied ordinal and one-hot encoding to ordinal and non-ordinal categorical features, respectively.
2.3. Feature Selection
Random Forest (RF) feature selection was used to select important features. Talukder (2020) recommends the use of RF feature selection when building a predictive model for malnutrition [19]. The model assigns an importance score to each feature, and features that had an importance score less than the average importance score were not included in the model. Figure 2 below shows each feature and its associated importance score.
2.4. Model Training
We split the data into 70% training and 30% testing dataset. We evaluated five widely
used machine learning classifiers, namely: Logistic regression (LR), Random Forest (RF),
Naïve Bayesian (NB), Support Vector Machine (SVM) and eXtreme Gradient Boosting (XG
Boost), implemented in scikit-learn [22], to predict the probability of stunting.
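The pre-processing steps of Section 2.2 and the 70:30 split described above can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not the authors' code: the helper names, the toy records, the feature names and the random seed are all hypothetical, and the split is a simple shuffled split with no stratification.

```python
import random
from statistics import mean, pstdev


def label_stunting(haz):
    """WHO definition used in the paper: stunted (1) if HAZ < -2 SD, else 0."""
    return 1 if haz < -2 else 0


def drop_missing(records):
    """Drop any record that has a missing (None) value."""
    return [r for r in records if all(v is not None for v in r.values())]


def standardize(values):
    """Rescale a continuous variable to mean 0 and standard deviation 1."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


def ordinal_encode(values, order):
    """Encode an ordered categorical variable as integer ranks."""
    rank = {level: i for i, level in enumerate(order)}
    return [rank[v] for v in values]


def one_hot_encode(values):
    """Encode an unordered categorical variable as 0/1 indicator columns."""
    levels = sorted(set(values))
    return [[int(v == level) for level in levels] for v in values]


def split_70_30(rows, seed=42):
    """Shuffle and split rows into 70% training and 30% testing (seed is arbitrary)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = (7 * len(shuffled)) // 10
    return shuffled[:cut], shuffled[cut:]


# Hypothetical toy records: (HAZ, age in months, wealth index, religion).
raw = [
    (-2.6, 30, "poor", "Protestant"), (-1.1, 8, "richer", "Catholic"),
    (-0.4, 14, "middle", "Protestant"), (-3.0, 41, "poor", "Protestant"),
    (0.2, 5, "richer", "Other"), (-2.1, 22, "middle", "Catholic"),
    (-1.9, 55, "poor", "Protestant"), (-2.4, 36, "poor", "Muslim"),
    (0.9, 11, "richer", "Protestant"), (-1.5, 47, "middle", "Catholic"),
    (-2.2, None, "poor", "Protestant"),  # missing age: dropped below
]
records = [dict(zip(("haz", "age_months", "wealth", "religion"), r)) for r in raw]

clean = drop_missing(records)
y = [label_stunting(r["haz"]) for r in clean]
age_z = standardize([r["age_months"] for r in clean])
wealth = ordinal_encode([r["wealth"] for r in clean], ["poor", "middle", "richer"])
religion = one_hot_encode([r["religion"] for r in clean])

rows = [[a, w] + oh + [label] for a, w, oh, label in zip(age_z, wealth, religion, y)]
train, test = split_70_30(rows)
```

The sketch standardizes before splitting because that is the order given in the workflow; estimating the scaling parameters on the training portion only is a common alternative.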
2.4.1. Logistic Regression
Logistic regression is a supervised machine learning algorithm used to solve classi-
fication problems [23]. It is a parametric method that assumes a Bernoulli distribution of
the target variable and the independence of the observations [24]. Logistic regression is a
common regression model used to predict class membership probabilities and is defined as:
$$\operatorname{logit}\{\Pr(Y_i = 1 \mid X_i)\} = \log\left(\frac{\pi_i}{1 - \pi_i}\right) = \beta_0 + \beta^{T} X_i$$
where $\pi_i = \Pr(Y_i = 1 \mid X_i)$ is the conditional probability of an observation being in class 1
given the covariates X, β0 is the intercept, and β is the vector of regression coefficients. The
logistic regression model can be fitted using the maximum likelihood method.
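To make the model concrete, the following sketch computes the class-1 probability from the linear predictor and fits the coefficients by plain gradient ascent on the log-likelihood. The optimizer, learning rate, iteration count and toy data are assumptions chosen for illustration; library implementations typically use more sophisticated solvers and regularization.

```python
import math


def predict_proba(x, beta0, beta):
    """pi = 1 / (1 + exp(-(beta0 + beta . x))), the inverse of the logit link."""
    eta = beta0 + sum(b * xi for b, xi in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-eta))


def log_likelihood(X, y, beta0, beta):
    """Bernoulli log-likelihood of the observed labels."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = predict_proba(xi, beta0, beta)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total


def fit_logistic(X, y, learning_rate=0.1, n_iter=500):
    """Maximum likelihood by gradient ascent on the mean log-likelihood."""
    beta0, beta = 0.0, [0.0] * len(X[0])
    n = len(X)
    for _ in range(n_iter):
        errors = [yi - predict_proba(xi, beta0, beta) for xi, yi in zip(X, y)]
        beta0 += learning_rate * sum(errors) / n
        beta = [
            b + learning_rate * sum(e * xi[j] for e, xi in zip(errors, X)) / n
            for j, b in enumerate(beta)
        ]
    return beta0, beta


# Hypothetical one-feature toy data (not from the ZDHS).
X = [[-2.0], [-1.0], [-0.5], [0.0], [0.5], [1.0], [1.5], [2.0]]
y = [0, 0, 1, 0, 1, 1, 0, 1]
beta0, beta = fit_logistic(X, y)
```

A positive fitted coefficient here simply reflects that larger feature values are associated with class 1 in the toy data.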
 
Figure 2. Feature importance score.
2.4.2. Random Forest
Random forest (RF) is an ensemble method consisting of a collection of tree-based structured classifiers [17]. RF is used for classification, regression and dimension reduction. It is efficient even in instances where there are more variables than observations. To classify, RF builds many decision trees; each tree makes its independent classification. The RF chooses the class that has the most votes.
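The voting step described above can be illustrated with a few hand-written decision stumps standing in for trained trees. The stumps, their thresholds and the feature names are hypothetical; in a real random forest each tree is grown on a bootstrap sample with random feature subsets, and some libraries average per-tree class probabilities rather than counting hard votes.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the class label predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


def forest_predict(trees, x):
    """Ask every tree for its own classification, then take the majority."""
    return majority_vote([tree(x) for tree in trees])


# Three hypothetical stumps standing in for fitted decision trees.
trees = [
    lambda x: 1 if x["age_months"] >= 12 else 0,
    lambda x: 1 if x["wealth"] == "poor" else 0,
    lambda x: 1 if x["mother_education_years"] < 7 else 0,
]

child = {"age_months": 24, "wealth": "poor", "mother_education_years": 10}
prediction = forest_predict(trees, child)  # votes are [1, 1, 0]
```

Using an odd number of trees avoids ties in the binary case.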
2.4.3. Naïve Bayesian (NB)
Naïve Bayesian is a collection of machine learning classification algorithms built on the Bayes theorem. These algorithms are built on two main assumptions; the first is that every pair of features being classified is independent of the other, and the second is that each makes an independent and equal contribution to the outcome. Though simple, the NB has high functionality [25,26]. For a binary outcome, a Bernoulli Naïve Bayesian algorithm is appropriate. NB formula is given as:
$$P(y \mid X) = \frac{P(X \mid y)\,P(y)}{P(X)}$$
where X is the independent predictors and P(X) is the predictors’ prior probability, also referred to as evidence. P(y|X) is the probability of label y given predictors X. This is also referred to as the posterior probability, and P(y) is referred to as the probability before evidence is seen or the prior. P(X|y) is known as the likelihood.
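A short numerical sketch of the Bayes rule for a Bernoulli Naïve Bayes classifier follows. The prior uses the stunting prevalence reported in this paper (34.2%); the per-class feature probabilities and the two binary features are hypothetical values chosen only to show the arithmetic.

```python
def bernoulli_likelihood(x, theta):
    """P(X | y) under the naive assumption that binary features are
    independent given the class; theta[j] = P(x_j = 1 | y)."""
    likelihood = 1.0
    for xj, tj in zip(x, theta):
        likelihood *= tj if xj == 1 else (1.0 - tj)
    return likelihood


def posterior(prior, likelihoods):
    """Bayes rule: P(y | X) = P(X | y) P(y) / P(X), with P(X) as the evidence."""
    evidence = sum(prior[label] * likelihoods[label] for label in prior)
    return {label: prior[label] * likelihoods[label] / evidence for label in prior}


prior = {1: 0.342, 0: 0.658}          # 1 = stunted (prevalence from the paper)
theta = {1: [0.6, 0.8], 0: [0.4, 0.7]}  # hypothetical P(x_j = 1 | y)
x = [1, 1]                             # both hypothetical risk indicators present

likelihoods = {label: bernoulli_likelihood(x, theta[label]) for label in prior}
post = posterior(prior, likelihoods)
```

Because both indicators are more common among stunted children in this toy setup, the posterior probability of stunting rises above the prior.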
2.4.4. Support Vector Machine
A support vector machine is a supervised machine learning algorithm whose goal
is identifying a reproducible hyperplane of n-dimensions that maximizes the distance
between support vectors of two class labels. SVM models are effective when there are more
variables than samples and still effective when the sample size is small. Although it is
memory efficient, SVM models do not provide probability estimates directly but through
an expensive five-fold cross-validation process [27,28].
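For a linear SVM, the separating hyperplane and the distance it maximizes can be written down directly. The sketch below assumes a weight vector and intercept that have already been learned (the values are hypothetical) and shows the decision rule and the margin width 2/||w||; it does not implement training or probability estimation.

```python
import math


def decision_function(w, b, x):
    """Signed score w . x + b; its sign tells which side of the hyperplane x is on."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w, b, x):
    """Class 1 on the non-negative side of the hyperplane, class 0 otherwise."""
    return 1 if decision_function(w, b, x) >= 0 else 0


def margin_width(w):
    """Distance between the two margin hyperplanes w . x + b = +1 and -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


w, b = [3.0, 4.0], -5.0  # hypothetical learned parameters
width = margin_width(w)  # 2 / 5
label = predict(w, b, [2.0, 1.0])
```

Training an SVM amounts to choosing w and b so that this width is as large as possible while keeping the training points on the correct side, up to a penalty for violations.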
2.4.5. XG Boost
XG boost, also known as eXtreme Gradient Boosting, is a decision tree-based ensemble
machine learning algorithm that uses a gradient boosting framework (Friedman et al.
(2000)) [29]. Boosting involves combining weak classifiers to produce a powerful averaged
classifier, and it is also a variance reduction technique. It can be applied to both classification
and prediction problems. The boosted decision trees are designed for optimal speed and
improved model performance [30,31].
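The idea of combining weak learners can be sketched with one-split regression stumps fitted to residuals under squared-error loss. This is a generic gradient boosting illustration, not XG Boost itself, which additionally uses second-order gradient information, regularization and many engineering optimizations; the data, learning rate and number of rounds are assumptions.

```python
def fit_stump(xs, residuals):
    """Find the single threshold whose two-group means best fit the residuals."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, n_rounds=20, learning_rate=0.3):
    """Add stumps one at a time, each fitted to the current residuals."""
    base = sum(ys) / len(ys)
    stumps, preds = [], [base] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


def mse(model, xs, ys):
    """Mean squared error of a fitted model on (xs, ys)."""
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical one-feature toy data.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 1, 0, 1, 1, 0, 1]
baseline = boost(xs, ys, n_rounds=0)   # predicts the mean only
boosted = boost(xs, ys, n_rounds=20)
```

On the training data, each added stump reduces the squared error, which is why the boosted model fits better than the mean-only baseline.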
2.5. Model Performance Evaluation
We plotted the reliability graphs to evaluate the performance of each model. We later
calibrated the predicted probabilities to reflect the occurrence of stunting in the data using
the isotonic regression. The major strength of isotonic regression as a calibration method is
that it can correct any monotonic distortion [32]. We determined the optimal probability
threshold, on the calibrated probabilities, for classifying an instance as stunted or not.
We used 3-fold cross-validation on the training set, and the performance was estimated
on the testing set. Models were evaluated based on the F1 score, Cohen’s kappa, the area
under the precision–recall curve (AUC-PR) and the sensitivity and specificity of each model.
Data analysis was conducted using Python version 3.10.2 [33]. Data were summarized using proportions, and a chi-squared test of independence was used to test for any association between stunting and background characteristics. Statistical significance was set at a p-value < 0.05.
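The isotonic calibration step mentioned in this section can be sketched with the pool-adjacent-violators algorithm, which replaces raw scores with the closest non-decreasing sequence of observed event rates. The function name and the toy scores are hypothetical, and this sketch only returns calibrated values for the training points; library implementations also interpolate for new scores.

```python
def isotonic_fit(scores, labels):
    """Pool adjacent violators.

    Sort by raw score, then merge neighbouring blocks whenever an earlier
    block has a higher mean label than the next one, so that the calibrated
    values are non-decreasing in the raw score.
    Returns (sorted_scores, calibrated_values).
    """
    pairs = sorted(zip(scores, labels))
    blocks = []  # each block is [sum_of_labels, count]
    for _, label in pairs:
        blocks.append([float(label), 1])
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    calibrated = []
    for total, count in blocks:
        calibrated.extend([total / count] * count)
    return [score for score, _ in pairs], calibrated


# Hypothetical raw classifier scores and true labels (1 = stunted).
raw_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
true_labels = [0, 1, 0, 1, 1, 1]
sorted_scores, calibrated = isotonic_fit(raw_scores, true_labels)
```

In this toy example the second and third points violate monotonicity (a positive followed by a negative), so they are pooled to a common calibrated value of 0.5.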
F1 score is the harmonic mean of the precision and recall of the model and is calculated using the formula below:
$$F_1 = \frac{2}{\frac{1}{\text{recall}} + \frac{1}{\text{precision}}} = 2 \times \frac{\text{precision} \times \text{recall}}{\text{precision} + \text{recall}} = \frac{tp}{tp + \frac{1}{2}(fp + fn)}$$
where recall, also known as sensitivity, is the proportion classified as positive among all the positive instances in the dataset. Precision is the proportion of true positive instances among the instances that the model has predicted as positive. True positive (tp) is the number of positive instances that are classified as positive by the model. False positive (fp) is the number of negative instances that are classified as positive by the model, and false negative (fn) is the number of positive instances that are classified as negative by the model.
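The three equivalent forms of the F1 score can be checked numerically. The confusion-matrix counts below are hypothetical and serve only to show that the harmonic-mean form, the product form and the count form agree.

```python
def precision_recall(tp, fp, fn):
    """Precision = tp / (tp + fp); recall (sensitivity) = tp / (tp + fn)."""
    return tp / (tp + fp), tp / (tp + fn)


def f1_harmonic(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 / (1.0 / recall + 1.0 / precision)


def f1_product(precision, recall):
    """2 * precision * recall / (precision + recall)."""
    return 2.0 * precision * recall / (precision + recall)


def f1_counts(tp, fp, fn):
    """tp / (tp + (fp + fn) / 2), written directly in terms of counts."""
    return tp / (tp + 0.5 * (fp + fn))


tp, fp, fn = 67, 82, 33  # hypothetical counts
precision, recall = precision_recall(tp, fp, fn)
f1_a = f1_harmonic(precision, recall)
f1_b = f1_product(precision, recall)
f1_c = f1_counts(tp, fp, fn)
```

Note that none of the three forms uses the number of true negatives, which is one reason F1 is often reported alongside accuracy for imbalanced outcomes.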
Cohen’s kappa score is a metric used to measure inter-rater agreement. Cohen’s kappa takes into account the agreement that may occur between two measures by chance, which is one of the reasons it is a robust measure for evaluating classification models.
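For a binary classifier, Cohen's kappa compares the observed agreement between predicted and true labels with the agreement expected by chance from the marginal totals. The sketch below uses the standard definition; the function name and the example counts are illustrative.

```python
def cohens_kappa(tp, fp, fn, tn):
    """kappa = (p_o - p_e) / (1 - p_e) for a 2x2 confusion matrix.

    p_o is the observed agreement (accuracy) and p_e the agreement expected
    by chance given the row and column totals.
    """
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Hypothetical counts: 70% raw agreement, 50% expected by chance.
kappa = cohens_kappa(tp=20, fp=10, fn=5, tn=15)
```

Here the raw agreement of 0.70 shrinks to a kappa of about 0.40 once chance agreement is removed, which illustrates why kappa values such as those in Table 2 can be much lower than the corresponding accuracies.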
3. Results
3.1. Characteristics of Participants
In our dataset, there were a total of 6799 children under five years. Of these, 3421 (50.3%)
were male, and 5253 (77.3%) were aged 12 months and above (Table 1). A child’s age,
wealth index, region and gender were associated with stunting. About 37.6% of children
aged between 12 months and 59 months were stunted compared to 22.6% of those aged less
than 12 months. A total of 38.2% of children born to a mother without any formal education
were stunted. Most of the children were from poor families (48%), and 38.3% of these were
stunted. The prevalence of stunting was 34.2%.
Table 1. Percent of stunted children by background characteristics.

| Characteristics | Number of Children (% of Total) | Stunted, n (%) | p-Value |
|---|---|---|---|
| Child’s age | | | <0.001 |
| <12 months | 1546 (22.7) | 350 (22.6) | |
| ≥12 months | 5253 (77.3) | 1977 (37.6) | |
| Gender | | | <0.001 |
| Male | 3421 (50.3) | 1075 (31.4) | |
| Female | 3378 (49.7) | 1252 (37.1) | |
| Mother’s Age (years) | | | 0.088 |
| 15–24 | 1938 (28.5) | 697 (36) | |
| 25–34 | 3181 (46.8) | 1084 (34.1) | |
| 35–49 | 1680 (24.7) | 546 (32.5) | |
| Region | | | <0.001 |
| Urban | 4875 (71.7) | 1734 (35.6) | |
| Rural | 1924 (28.3) | 593 (30.8) | |
| Mother’s Education | | | <0.001 |
| No formal education | 738 (10.9) | 282 (38.2) | |
| Primary | 3722 (54.7) | 1390 (37.3) | |
| Secondary | 2045 (30.1) | 615 (30.1) | |
| Higher | 294 (4.3) | 40 (13.6) | |
| Mother’s Current Work | | | 0.158 |
| No | 3505 (51.6) | 1172 (33.4) | |
| Yes | 3294 (48.4) | 1155 (35.1) | |
| Wealth Index | | | <0.001 |
| Poor | 3261 (48) | 1250 (38.3) | |
| Middle | 1308 (19.2) | 453 (34.6) | |
| Richer | 2230 (32.8) | 624 (28) | |
| Religion | | | 0.183 |
| Muslim | 39 (0.6) | 17 (43.6) | |
| Catholic | 1093 (16.1) | 399 (36.5) | |
| Protestant | 5603 (82.4) | 1891 (33.7) | |
| Other | 64 (0.9) | 20 (31.3) | |
| Toilet type | | | 0.094 |
| Unhygienic | 4265 (62.7) | 1428 (33.5) | |
| Hygienic | 2534 (37.3) | 899 (35.5) | |
| Total | 6799 (100) | 2327 (34.2) | |
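The chi-squared test of independence behind the p-values in Table 1 can be sketched for the two-category rows. The sketch assumes an unweighted Pearson chi-squared statistic without continuity correction, which is an assumption about the analysis rather than something stated in the text; with one degree of freedom the p-value has a closed form through the complementary error function.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value). For 1 degree of freedom the upper tail
    probability is erfc(sqrt(statistic / 2)).
    """
    n = a + b + c + d
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed[i][j] - expected) ** 2 / expected
    return statistic, math.erfc(math.sqrt(statistic / 2.0))


# Rows are (stunted, not stunted), with counts taken from Table 1.
work_stat, work_p = chi2_2x2(1172, 3505 - 1172, 1155, 3294 - 1155)      # mother's current work
gender_stat, gender_p = chi2_2x2(1075, 3421 - 1075, 1252, 3378 - 1252)  # child's gender
```

Under these assumptions the mother's current work comparison gives a p-value of roughly 0.158, in line with Table 1, and the gender comparison gives a p-value well below 0.001.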
3.2. Features Selected Using Random Forest Feature Selection
Figure 2 shows the importance score for each feature in the dataset. The child’s age in months had the highest importance score, while having an improved toilet facility had the lowest importance score. Only 13 out of 58 features were included. Child’s gender, mother’s age, age of household head, child’s age (months), number of sleeping rooms in the household, number of women between 15 and 49 in the household, the birth interval in months, shared toilet, mother’s current employment status, years of education, number of children under five in the household and the total number of children ever born to a mother were among the features selected as predictors of stunting among children under five.
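The selection rule from Section 2.3 (keep a feature only if its importance is at least the average importance) can be sketched as below. The importance values and feature names are hypothetical placeholders, not the scores shown in Figure 2; in practice they would come from a fitted random forest.

```python
def select_above_mean(importances):
    """Keep features whose importance is not less than the mean importance,
    returned in descending order of importance."""
    threshold = sum(importances.values()) / len(importances)
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return [name for name, score in ranked if score >= threshold]


# Hypothetical importance scores that sum to 1 (mean = 0.2).
importances = {
    "child_age_months": 0.40,
    "birth_interval_months": 0.30,
    "mother_age": 0.15,
    "religion": 0.10,
    "improved_toilet": 0.05,
}
selected = select_above_mean(importances)
```

With these placeholder scores only the two features above the mean of 0.2 are retained, mirroring how 13 of the 58 candidate features were kept in the study.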
Figure 2 shows the feature importance score for each feature included in the analysis dataset in descending order.
3.3. Comparison of Efficiency of Machine Learning Algorithms
Figure 3 shows the calibration curves for each machine learning algorithm with the average predicted probability for each bin on the x-axis and the fraction of positive classes in each bin on the y-axis, and below the count of positives in each bin.
Figure 3. Calibration curve for ML algorithms.
All five models showed divergence from the perfectly calibrated line (Figure 3). This implies the need to calibrate the predicted probability distribution. SV Classification was the worst-performing in all bins compared to other models. After calibrating the predicted probabilities, all models improved and were better aligned with the perfect calibration curve, except for the Naïve Bayes model. The model was dropped because it was inaccurate even after calibration (Figure 3).
We presented the predictive performance of each classifier on the training and test dataset (Table 2). Logistic regression was the least accurate model on both the training and the test dataset, with an accuracy of 44.7% and 45.9%, respectively, whereas the Random Forest model was superior, with a training and testing accuracy of 79.2% and 61.6%, respectively. The Random Forest model had the largest F1 score in the training dataset and the lowest in the testing dataset, while logistic regression had the lowest F1 score in the training dataset but the highest in the testing dataset. Logistic regression had a Cohen’s kappa score of 0.08 in both the training and testing datasets. Random forest had a Cohen’s kappa score of 0.55, the highest in the training dataset, and 0.178, the second largest in the testing dataset (Table 2).
Table 2. Accuracy score for each classification model in predicting stunting.

| Model | Train F1 | Test F1 | Train Cohen’s Kappa | Test Cohen’s Kappa | Train PR-AUC | Test PR-AUC | Train Accuracy | Test Accuracy |
|---|---|---|---|---|---|---|---|---|
| Logistic Regression | 0.5298 | 0.5411 | 0.0797 | 0.0833 | 0.3728 | 0.3858 | 0.4471 | 0.4592 |
| Random Forest | 0.717 | 0.4826 | 0.5535 | 0.178 | 0.5992 | 0.4134 | 0.7921 | 0.6162 |
| SV Classification | 0.6083 | 0.523 | 0.3106 | 0.1486 | 0.4611 | 0.4051 | 0.6402 | 0.5583 |
| XG Boost | 0.6006 | 0.5381 | 0.3019 | 0.188 | 0.4566 | 0.4192 | 0.6385 | 0.5851 |
We showed the average precision and recall for each machine learning classifier (Table 3). XG Boost had precision and recall of 60%, and logistic regression had a precision of 58% and a recall of 55%. Random forest and SV classification had precision and recall of 59% and 58%, respectively.
Table 3. Precision and recall for each machine learning algorithm.

| Model | Precision Negative | Precision Positive | Average Precision | Recall Negative | Recall Positive | Average Recall |
|---|---|---|---|---|---|---|
| Logistic Regression | 0.78 | 0.39 | 0.585 | 0.22 | 0.89 | 0.555 |
| Random Forest | 0.71 | 0.47 | 0.59 | 0.68 | 0.5 | 0.59 |
| SV Classification | 0.73 | 0.43 | 0.58 | 0.49 | 0.67 | 0.58 |
| XG Boost | 0.75 | 0.45 | 0.6 | 0.54 | 0.67 | 0.605 |
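The averages in Table 3 appear to be unweighted (macro) means of the negative-class and positive-class values, and the positive-class precision and recall appear to be consistent with the test F1 scores in Table 2. The sketch below checks both observations from the rounded table values; this reading of the tables is an inference, and small discrepancies are expected because the inputs are rounded to two decimals.

```python
# (precision_negative, precision_positive, recall_negative, recall_positive) from Table 3.
table3 = {
    "Logistic Regression": (0.78, 0.39, 0.22, 0.89),
    "Random Forest": (0.71, 0.47, 0.68, 0.50),
    "SV Classification": (0.73, 0.43, 0.49, 0.67),
    "XG Boost": (0.75, 0.45, 0.54, 0.67),
}
# Test F1 from Table 2.
test_f1 = {
    "Logistic Regression": 0.5411,
    "Random Forest": 0.4826,
    "SV Classification": 0.5230,
    "XG Boost": 0.5381,
}


def macro_average(negative_value, positive_value):
    """Unweighted mean of the per-class values."""
    return (negative_value + positive_value) / 2.0


def f1_score(precision, recall):
    """2 * precision * recall / (precision + recall)."""
    return 2.0 * precision * recall / (precision + recall)


summary = {}
for model, (prec_neg, prec_pos, rec_neg, rec_pos) in table3.items():
    summary[model] = {
        "average_precision": macro_average(prec_neg, prec_pos),
        "average_recall": macro_average(rec_neg, rec_pos),
        "positive_class_f1": f1_score(prec_pos, rec_pos),
    }
```

For every model the recomputed positive-class F1 lies within about 0.002 of the test F1 reported in Table 2.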
4. Discussion
In our study, we identified the random forest model as the model with the highest predictive accuracy for stunting among children under five years in Zambia using the ZDHS 2018 data. Although Random Forest and XG Boost performed better than the traditional logistic regression, logistic regression retains interpretability as its main advantage over the other ML algorithms. Similar studies used ML algorithms to predict the nutritional status of children using demographic health survey data [18–20,32]. Our results are similar to the findings of [19], which found the RF algorithm to be a superior predictor of stunting.
Further, we identified features that are important in predicting stunting among children under five in Zambia. Some of the identified features, such as mother’s education, mother’s age, age of the child, family size and residence, have commonly been identified as predictors of stunting [11,16,34,35]. We suggest the need to collect more features for predicting stunting as opposed to keeping only those variables collected in the ZDHS. Though powerful, the ML models have a limitation in that they do not come with odds ratios or coefficients to indicate the direction of the relationship of the important features. Knowing the direction of the association of each important feature would enhance the design and implementation of interventions aimed at preventing stunting among children under 5 years.
Despite being a widely used measure for stunting, HAZ (Z-score < −2) is an arbitrary
cut-off for stunting, which may have little clinical significance [36]. Our study presents an
opportunity to look at individualized risk assessment for stunting among children. Further,
our study took into consideration key social, economic and environmental factors that may
be key determinants of a child’s nutrition status compared to HAZ, which only takes into
consideration the age, height and weight of children, which is one major strength of the
ML algorithm in predicting health outcomes [37].
The major strength of this study is its use of the ZDHS dataset; the ZDHS applies a robust sampling method and, as such, is representative of the under-five population of children from both rural and urban parts of Zambia. Despite applying an optimized machine classification model, the study applied only five commonly used algorithms, although more algorithms, such as the ones used in [38], could have been considered. Machine learning algorithms use features based on their importance or contribution to the model and not necessarily their causal effect.
We recommend further research on developing a prognostic model for stunting using longitudinal data. This would help in the development of timely interventions aimed at preventing stunting. Although the ZDHS dataset is very representative of the Zambian population of children under five, it may not contain features that would be very instrumental in building a predictive model for stunting; this is because the data are not collected for the sole purpose of a study of this nature. Machine learning aids timely diagnosis of stunting and timely design, evaluation and deployment of mitigation measures. Since the effects of stunting are irreversible in the long run, having a machine learning-based risk score would aid the treatment of malnutrition.
5. Conclusions
The results suggest that the Random Forest machine learning algorithm has the highest
predictive accuracy for stunting compared to other models applied in this study. We also
identified the children’s and their mothers’ social and economic features that are important
predictors of stunting among children under five.
Author Contributions: Conceptualization, O.N.C. and S.B.; methodology, O.N.C.; software, O.N.C.;
validation, O.N.C., S.B., N.T.I. and I.N.; formal analysis, O.N.C.; data curation, O.N.C. and K.H.;
writing—original draft preparation, O.N.C.; writing—review and editing, O.N.C., S.B., R.C., C.C.C.,
M.S., M.C., K.H., R.K.S. and N.T.I.; supervision, N.T.I., I.N. and S.B. All authors have read and agreed
to the published version of the manuscript.
Funding: This research received no external funding.
Informed Consent Statement: Not applicable.
Data Availability Statement: The data used for this study can be accessed with permission from
IPUMS DHS https://www.idhsdata.org/idhs/ (accessed on 22 March 2021).
Acknowledgments: The Enteric Diseases and Vaccines Research Unit (EDVRU) of the Centre for
Infectious Disease research in Zambia for the support rendered towards the completion of this study.
The World bank Africa Center of excellence in Data Science administration at the University of
Rwanda for facilitating the smooth completion of this study.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. World Health Organization. Levels and Trends in Child Malnutrition: Geneva, 2021; UNICEF: New York, NY, USA, 2021;
ISBN 9789240025257.
2. World Health Organization. Levels and Trends in Child Malnutrition: Geneva, 2020; UNICEF: New York, NY, USA, 2020;
ISBN 0259238X.
3. Quamme, S.H.; Iversen, P.O. Prevalence of child stunting in Sub-Saharan Africa and its risk factors. Clin. Nutr. Open Sci. 2022, 42,
49–61. [CrossRef]
4. Zambia Statistics Agency; Ministry of Health (MOH) [Zambia]. Zambia Demographic and Health Survey 2018; University Teaching
Hospital Virology Laboratory: Lusaka, Zambia; ICF: Rockville, MD, USA, 2019.
5. De Onis, M.; Borghi, E.; Arimond, M.; Webb, P.; Croft, T.; Saha, K.; De-Regil, L.M.; Thuita, F.; Heidkamp, R.; Krasevec, J.; et al.
Prevalence thresholds for wasting, overweight and stunting in children under 5 years. Public Health Nutr. 2019, 22, 175–179.
[CrossRef]
6. Markowitz, D.L.; Cosminsky, S. Overweight and stunting in migrant Hispanic children in the USA. Econ. Hum. Biol. 2005, 3,
215–240. [CrossRef]
7. Fanzo, J.; Hawkes, C.; Udomkesmalee, E.; Afshin, A.; Allemandi, L.; Assery, O.; Baker, P.; Battersby, J.; Bhutta, Z.; Chen, K. Global
Nutrition Report: Shining a Light to Spur Action on Nutrition; Development Initiatives Poverty Research Ltd.: Bristol, UK, 2018;
ISBN 9780896295643.
8. Myatt, M.; Khara, T.; Schoenbuchner, S.; Pietzsch, S.; Dolan, C.; Lelijveld, N. Children who are both wasted and stunted are also
underweight and have a high risk of death: A descriptive epidemiology of multiple anthropometric deficits using data from
51 countries. Arch. Public Health 2018, 76, 28. [CrossRef]
9. Ong, K.K.; Hardy, R.; Shah, I.; Kuh, D. Childhood stunting and mortality between 36 and 64 years: The british 1946 birth cohort
study. J. Clin. Endocrinol. Metab. 2013, 98, 2070–2077. [CrossRef]
10. Dewey, K.G.; Begum, K. Long-term consequences of stunting in early life. Matern. Child Nutr. 2011, 7, 5–18. [CrossRef]
11. Mzumara, B.; Bwembya, P.; Halwiindi, H.; Mugode, R.; Banda, J. Factors associated with stunting among children below five
years of age in Zambia: Evidence from the 2014 Zambia demographic and health survey. BMC Nutr. 2018, 4, 51. [CrossRef]
12. Rakotomanana, H.; Gates, G.E.; Hildebrand, D.; Stoecker, B.J. Determinants of stunting in children under 5 years in Madagascar.
Matern. Child Nutr. 2017, 13, e12409. [CrossRef]
13. Das, S.; Gulshan, J. Different forms of malnutrition among under five children in Bangladesh: A cross sectional study on
prevalence and determinants. BMC Nutr. 2017, 3, 1. [CrossRef]
14. Rajula, H.S.R.; Verlato, G.; Manchia, M.; Antonucci, N.; Fanos, V. Comparison of conventional statistical methods with machine
learning in medicine: Diagnosis, drug development, and treatment. Medicina 2020, 56, 455. [CrossRef]
15. Iniesta, R.; Stahl, D.; McGuffin, P. Machine learning, statistical learning and the future of biological research in psychiatry. Psychol.
Med. 2016, 46, 2455–2465. [CrossRef]
16. Shahriar, M.; Iqubal, M.S.; Mitra, S.; Das, A.K. A deep learning approach to predict malnutrition status of 0-59 month’s older
children in Bangladesh. In Proceedings of the 2019 IEEE International Conference on Industry 4.0, Artificial Intelligence, and
Communications Technology (IAICT), Bali, Indonesia, 1–3 July 2019; pp. 145–149. [CrossRef]
17. Jin, Z.; Shang, J.; Zhu, Q.; Ling, C.; Xie, W.; Qiang, B. RFRSF: Employee Turnover Prediction Based on Random Forests and
Survival Analysis. In Web Information Systems Engineering—WISE 2020; Lecture Notes in Computer Science; Springer: Cham,
Switzerland, 2020; 12343 LNCS; pp. 503–515. [CrossRef]
18. Markos, Z.; Doyore, F.; Yifiru, M.; Haidar, J. Predicting Under Nutrition Status of Under-Five Children Using Data Mining
Techniques: The Case of 2011 Ethiopian Demographic and Health Survey. J. Health Med. Inform. 2014, 5, 1000152. [CrossRef]
19. Talukder, A.; Ahammed, B. Machine learning algorithms for predicting malnutrition among under-five children in Bangladesh.
Nutrition 2020, 78, 110861. [CrossRef]
20. Bitew, F.H.; Sparks, C.S.; Nyarko, S.H. Machine learning algorithms for predicting undernutrition among under-five children in
Ethiopia. Public Health Nutr. 2021, 25, 269–280. [CrossRef]
21. Boyle, E.H.; King, M.; Sobek, M. IPUMS-Demographic and Health Surveys: Version 8 [dataset]; IPUMS: Minneapolis, MN, USA; ICF:
Rockville, MD, USA, 2020. [CrossRef]
22. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.;
Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
23. Lee, W. Python® Machine Learning, 1st ed.; John Wiley & Sons, Inc.: Indianapolis, IN, USA, 2019; ISBN 978-1-119-54563-7.
24. Cox, D.R. The Regression Analysis of Binary Sequences. J. R. Stat. Soc. Ser. B 1958, 20, 215–232. [CrossRef]
25. McCallum, A.; Nigam, K. A comparison of event models for naive bayes text classification. AAAI-98 Workshop Learn. Text Categ.
1998, 752, 41–48.
26. Zhang, D. Bayesian Classification. In Fundamentals of Image Data Mining; Texts in Computer Science; Springer: Cham, Switzerland, 2019.
[CrossRef]
27. Pisner, D.A.; Schnyer, D.M. Support Vector Machine; Elsevier Inc.: London, UK, 2019; ISBN 9780128157398. [CrossRef]
28. Lewes, G.H. Support Vector Machines for Classification. In Efficient Learning Machines: Theories, Concepts, and Applications for
Engineers and System Designers; Awad, M., Khanna, R., Eds.; Springer: Cham, Switzerland, 2015; pp. 39–66. [CrossRef]
29. Friedman, J.; Tibshirani, R.; Hastie, T. Additive logistic regression: A statistical view of boosting (With discussion and a rejoinder
by the authors). Ann. Stat. 2000, 28, 337–407. [CrossRef]
30. Nokeri, T.C. Data Science Solutions with Python; Apress: Pretoria, South Africa, 2022; ISBN 978-1-4842-7761-4.
31. Sheridan, R.P.; Wang, W.M.; Liaw, A.; Ma, J.; Gifford, E.M. Extreme Gradient Boosting as a Method for Quantitative Structure-
Activity Relationships. J. Chem. Inf. Model. 2016, 56, 2353–2360. [CrossRef]
32. Caruana, R.; Niculescu-Mizil, A. Predicting good probabilities with supervised learning. In Proceedings of the 22nd International
Conference on Machine Learning, Bonn, Germany, 7–11 August 2005; pp. 4943–4947. [CrossRef]
33. Python Software Foundation. Python Language Reference. Available online: http://www.python.org (accessed on 23 June 2020).
34. Mediani, H.S. Predictors of Stunting Among Children Under Five Year of Age in Indonesia: A Scoping Review. Glob. J. Health Sci.
2020, 12, 83. [CrossRef]
35. Bwalya, B.B.; Lemba, M.; Mapoma, C.C.; Mutombo, N. Factors Associated with Stunting among Children Aged 6–23 Months
in Zambian: Evidence from the 2007 Zambia Demographic and Health Survey. Int. J. Adv. Nutr. Health Sci. 2015, 3, 116–131.
[CrossRef]
36. Perumal, N.; Bassani, D.G.; Roth, D.E. Use and misuse of stunting as a measure of child health. J. Nutr. 2018, 148, 311–315.
[CrossRef]
37. Mhasawade, V.; Zhao, Y.; Chunara, R. Machine learning and algorithmic fairness in public and population health. Nat. Mach.
Intell. 2021, 3, 659–666. [CrossRef]
38. Deo, R.C. Machine learning in medicine. Circulation 2015, 132, 1920–1930. [CrossRef]