Quality Engineering, ISSN 0898-2112 (Print), 1532-4222 (Online). Journal homepage: https://www.tandfonline.com/loi/lqen20

Control charting methods for autocorrelated cyber vulnerability data

Anthony Afful-Dadzie (Business School, University of Ghana, Accra, Ghana) and Theodore T. Allen (Integrated Systems Engineering, The Ohio State University, Columbus, Ohio)

To cite this article: Anthony Afful-Dadzie & Theodore T. Allen (2016) Control charting methods for autocorrelated cyber vulnerability data, Quality Engineering, 28:3, 313-328, DOI: 10.1080/08982112.2015.1125926. Link: https://doi.org/10.1080/08982112.2015.1125926. Published online: 31 Mar 2016.

ABSTRACT: Control charting cyber vulnerabilities is challenging because the same vulnerabilities can remain from period to period. Also, hosts (personal computers, servers, printers, etc.) are often scanned infrequently and can be unavailable during scanning. To address these challenges, control charting of the period-to-period demerits per host using a hybrid moving centerline residual-based and adjusted demerit (MCRAD) chart is proposed. The intent is to direct limited administrator resources to unusual cases when automatic patching is insufficient. The proposed chart is shown to offer superior average run length performance compared with three alternative methods from the literature. The methods are illustrated using three datasets.

KEYWORDS: autocorrelation; average run length (ARL); control charts; EWMA control charts; statistical control
Introduction

Cyber attacks are on the increase and many organizations are losing substantial amounts of money as a result. A study of the financial impact, customer turnover, and actions taken by 51 companies in the United States concluded that, on average, the cost of a successful attack in 2010 increased to $7.2 million, up 7% from $6.8 million in 2009 (Ponemon Institute 2011). Cyber vulnerabilities are ways that hosts such as personal computers, servers, and printers can be exploited. Examples of vulnerabilities include weak passwords, weak authentication processes, unsupported operating systems, information disclosures, and the use of software with known exploitable bugs. Reportedly, over 90% of successful attacks exploit known vulnerabilities for which a patch exists but has not been applied by the system administrators (Legard 2002). Therefore, while new technology to identify and patch vulnerabilities is important, securing and focusing human resources to eliminate known vulnerabilities is also important.

The objective of this article is to propose control charting methods for cyber vulnerabilities to direct the attention of system administrators to unusual occurrences that correspond to assignable causes that they can address. As noted in Afful-Dadzie and Allen (2014), a substantial fraction of vulnerabilities are repaired each month by automatic patching without local intervention. Typically, only a tiny fraction of vulnerabilities are repaired manually because of automatic patching and limited resources. As a result, it may be of interest for administrators to intervene only when there is something unusual occurring (i.e., an assignable cause) or, alternatively, when a major threat is clear (e.g., an on-going attack). Therefore, this article focuses on a statistical process control approach designed to signal the presence of assignable causes.

Previous authors have developed monitoring techniques relating to cyber vulnerabilities. Yet, some have used data that are not available in vulnerability reports. For example, Dowdy (2012) discusses the challenges in integrating data from many sources to summarize risks. Abedin et al. (2006) also use traffic volumes as part of a comprehensive network evaluation approach. Further, Abedin et al. (2006) introduce exponential functions in their formulations, which potentially complicate the interpretation. Other authors have based their metrics on forecasted quantities without invoking the concepts of statistical process control (Ahmed et al. 2008). In this article, a relatively simple monitoring technique based on readily available data and statistical process control is proposed.

Cyber vulnerability data are often provided monthly with reference to the Common Vulnerability Scoring System (CVSS) described in Mell et al. (2007). CVSS scores range from 0.0, meaning no vulnerability, to 10.0, indicating that the program evaluating the system (scanner) is in a position to take over the host system.
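Scanners bucket these CVSS scores into severity bands. The following minimal sketch maps a score to a band; the band edges are the categories cited later in this article, and the function itself is an illustrative assumption rather than part of CVSS:

```python
# Map a CVSS score to a severity band. The band edges follow the
# categories cited in this article (low 0.0-3.9, medium 4.0-6.9,
# high 7.0-9.9, critical 10.0); the function name is illustrative.
def severity_band(score: float) -> str:
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS scores lie in [0.0, 10.0]")
    if score == 10.0:
        return "critical"
    if score >= 7.0:
        return "high"
    if score >= 4.0:
        return "medium"
    if score > 0.0:
        return "low"
    return "none"  # a score of 0.0 means no vulnerability
```

For example, a host with scores of 5.0, 5.0, and 10.0 would count as 2 mediums and 1 critical under this banding.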
Yet, Runger and Willemain (1995) noted the poor average run length performance of residual charts, given their diminished capacity to identify shifts after the first subgroup following the shift. MCD charts are also based on residuals and can be expected to have similar performance. This deficiency motivates two new charts that are proposed in this article. These are the "adjusted demerit" (AD) and hybrid moving centerline residual-based and adjusted demerit (MCRAD) charts for monitoring cyber vulnerabilities. An average run length (ARL) comparison is also described to confirm the benefits of the proposed methods.

The remainder of this article is organized as follows. First, alternative statistical process control (SPC) charts relevant to cyber vulnerability data are described. Because of the repeat nature of cyber vulnerabilities, the focus is on procedures specifically addressing autocorrelated data. The reviewed procedures include moving centerline demerit (MCD) charts from Nembhard and Nembhard (2000) and moving centerline charts based on AR(1) residuals. Issues with residual-based charting are used to motivate the proposed adjusted demerit (AD) and moving centerline residual-based and adjusted demerit (MCRAD) methods. The average run lengths (ARLs) of the alternative methods are then compared. Next, the application of the proposed methods is illustrated using three cyber vulnerability datasets from different organizations. Finally, conclusions are presented and opportunities for future work are described.

Common scanning technology divides vulnerabilities into categories based on the CVSS score: low (0.0–3.9), medium (4.0–6.9), high (7.0–9.9), and critical (10.0). A given host could have multiple vulnerabilities, e.g., 2 mediums and 1 critical. Therefore, the situation is somewhat analogous to manufacturing with nonconformity counts of different levels of severity. Weightings of these counts, or demerits, and the associated demerit charting techniques are potentially relevant in this case (e.g., see Nembhard and Nembhard 2000). Also, because of the infrequent (often monthly) nature of the relevant data, charting without subgroups or "individuals" control charting of demerits is relevant.

However, unlike in manufacturing, the hosts (or units) with the vulnerabilities (or nonconformities) are not shipped each period. Instead, these hosts might be personal computers which are used for multiple months and might likely have the same vulnerabilities for an extended period. On-going "patching" eliminates a fraction of the vulnerabilities each month, but far fewer than 100%. The accumulation of vulnerabilities almost unavoidably induces autocorrelation, or correlation in period-to-period nonconformity counts. Autocorrelation is a major issue related to control chart performance (Alwan and Roberts 1988; Montgomery and Mastrangelo 1991; Runger and Willemain 1995; Loredo et al. 2002; Nembhard and Nembhard 2000). An additional complication is that local vulnerabilities are influenced by external causes, including continual discoveries of new vulnerabilities for the software in use. These phenomena could cause a constant increase in vulnerability counts over time on many systems (Alhazmi and Malaiya 2005).

Charting based on autoregressive (AR) moving average modeling promises to eliminate the adverse effects of autocorrelation and trending because the model residuals are generally uncorrelated and de-trended (Montgomery and Mastrangelo 1991; Runger and Willemain 1995).

Statistical process control charting

In this section, four alternative methods are described. As mentioned previously, the carryover of vulnerabilities from one period to the next causes a high degree of autocorrelation in related vulnerability data. Therefore, the focus here is on methods specifically addressing autocorrelation rather than general techniques such as exponentially weighted moving average (EWMA) charts.
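To see how month-to-month carryover of the same vulnerabilities induces autocorrelation, consider the following small simulation sketch; the carryover fraction and new-demerit rate are invented for illustration, not estimated from the article's data:

```python
import random

random.seed(1)

# Each month, a fraction of the previous month's demerits survives
# patching and new demerits are discovered (illustrative parameters).
carryover = 0.8      # fraction of demerits that survive patching
y = [25.0]           # monthly demerit totals
for _ in range(999):
    y.append(carryover * y[-1] + random.gauss(5.0, 2.0))

def lag1_autocorr(series):
    """Sample lag-1 autocorrelation of a series."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[i] - mean) * (series[i + 1] - mean)
              for i in range(n - 1))
    den = sum((v - mean) ** 2 for v in series)
    return num / den

r1 = lag1_autocorr(y)   # lands near the carryover fraction of 0.8
```

Even though the month-to-month "shocks" are independent, the carryover alone produces a strongly autocorrelated demerit series.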
Also, the charting methods can be applied both retrospectively as an analysis technique and also built into scanning software for active monitoring.

Perhaps the simplest of the relevant schemes is based on the first-order autoregressive or AR(1) model. Authors have noted the ability of such approaches to address autocorrelation as well as underlying trends (Runger and Willemain 1995). Another relevant approach is moving centerline demerit (MCD) charts, which offer the advantage that the charted quantity is intuitive, i.e., it is the demerits per unit (Nembhard and Nembhard 2000).

The first alternative method explored here is moving centerline demerit (MCD) charting from Nembhard and Nembhard (2000). The second is a trivial combination of the residual charting methods from Runger and Willemain (1995) and the moving centerline concept from Nembhard and Nembhard (2000). Next, an adjusted demerit chart and a hybrid moving centerline residual-based and adjusted demerit (MCRAD) method are proposed. The motivations for the proposed methods relate to the objectives of improved average run length performance and interpretability.

Residual-based charts

In general, the residuals of a defensible time series model are approximately independent and identically distributed from a normal distribution, assuming that the process is under control (for example, when there are no shifts). The properties of the residuals can be evaluated using autocorrelation function (ACF) and partial autocorrelation function (PACF) residual plots. The charting of residuals from time series models such as AR(1) was described in Runger and Willemain (1995). For the AR(1) process, the model prediction can be written:

ŷ_i = μ + ϕy_{i−1} [1]

and the model residual is simply:

ε̂_i = y_i − ŷ_i [2]

where y_i is the dependent variable (or the demerits per unit in our model) at time period i, ε̂_i is white noise with zero mean and constant variance, and μ and −1 < ϕ < 1 are constants to be determined. The symbol "^" denotes the estimated or predicted value based on the data, y_i, for time periods i = 1, …, p. In a residual chart, the charted quantity is ε̂_i in Eq. [2].

Nembhard and Nembhard (2000) examined charts based on the residuals in Eq. [2] and proposed two modifications. First, they argued that charting of residuals is not intuitive for decision-makers in that they are generally more interested in the process mean than in the model residuals. Instead of residual charting, they proposed using a moving centerline based on model predictions and moving limits based on the standard deviation of the residuals. Their proposed approach is in accord with the insights in Alwan and Roberts (1988), who had argued that residual charting was insufficient, while offering the simplicity of a single chart. The Nembhard and Nembhard (2000) moving centerline approach is functionally identical to residual charting in that the charts would deliver the same out-of-control signals in identical situations, and yet the charted quantity is the demerits per unit. Second, Nembhard and Nembhard (2000) argued that time series modeling might be too complicated for many possible users and that the exponentially weighted moving average (EWMA) offers similar predictions with only a single adjustable parameter, λ. Therefore, they based the centerline (CL) of their moving centerline demerit (MCD) chart on the following EWMA formula:

CL_{i+1} = ŷ_{i+1} = λy_i + (1 − λ)ŷ_i [3]

where λ is the weight given to the most recent weighted value and must satisfy 0 < λ ≤ 1. Then, the MCD upper control limit, UCL_{i+1}, and lower control limit, LCL_{i+1}, are:

UCL_{i+1} = ŷ_{i+1} + Mσ̂
LCL_{i+1} = ŷ_{i+1} − Mσ̂ [4]

where M is a potentially adjustable parameter given in Nembhard and Nembhard (2000), and usually M = 3.0. The parameter σ̂ is the standard deviation of the one-step-ahead prediction errors e_i = y_i − ŷ_i, which are independent and uncorrelated with mean zero. Nembhard and Nembhard (2000) proposed two procedures for estimating λ and σ̂, the first of which is used for illustration here and involves selecting λ to minimize the sum of squared residuals and taking σ̂ as the root mean squared residual.

A trivial variant of the MCD charts is to simply base the predictions on the time series model in Eq. [1] instead of the EWMA model in Eq. [3]. This approach offers the benefit of MCD charts in that the charted quantity is the intuitive demerits per unit. Also, the predictions are based on the likely more accurate time series model instead of the EWMA model. The proposed variant is referred to as moving centerline residual-based demerit (MCRD) charting. The MCRD chart is slightly different than a residual chart because, unlike the residual chart, the MCRD chart will adjust the lower limit to zero in situations when the calculated lower control limit is negative.

As mentioned previously, MCD and MCRD charts are approximately equivalent to residual charts in the signals generated. Also, Runger and Willemain (1995) documented the average run length (ARL) properties of residual charts with two notable findings. First, residual charts offer run-length performance that may be considered poor based on the tables provided by Runger and Willemain (1995) compared with alternatives for cases without autocorrelation and with EWMA charts. Second, the poor performance relates to the fact that residual charts offer a relatively high probability of generating an out-of-control signal in the first subgroup after a shift. After the first subgroup, the chance of detecting the shift is greatly diminished. In the next section, two charting techniques are proposed with the objective of offering improved ARL performance compared with the MCD and MCRD charting procedures.

Adjusted demerit charts

Standard demerit charts are generally considered to be inapplicable to cases involving significant autocorrelation (e.g., see Montgomery, 2012). These charts are based on the assumption of independently Poisson distributed demerits. While the Poisson distribution seems approximately appropriate for weighted vulnerability counts, the assumption of independence from period to period does not apply because of the significant autocorrelation. In what follows, we first present the standard demerit control chart model, point out its limitations for charting demerits per unit of cyber vulnerability data, and propose an adjusted demerit control chart for overcoming such limitations.

Then, the center line (CL) of the demerit control chart is:

CL = Σ_{k=1}^{m} w_k·D̄_k [7]

The upper and lower control limits for period i are:

UCL_i = CL + Mσ̂_i
LCL_i = max[(CL − Mσ̂_i), 0] [8]

where

σ̂_i = √( Σ_{k=1}^{m} w_k²·D̄_k / n_i ) [9]

and where M is a potentially adjustable parameter which, in standard demerit charts, is 3.0.

The standard demerit chart given above is likely to foster high false alarm rates if applied to charting cyber vulnerabilities for the following reasons. The estimated standard deviation in Eq. [9] is based on the assumption that the demerit counts of different levels of severity are uncorrelated. For the cyber vulnerabilities in the case studies shown later, at least two counts of vulnerabilities are significantly correlated for all three of the organizations considered.
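The standard demerit chart quantities in Eqs. [7]–[9] can be sketched as follows, with hypothetical weights, per-period severity counts, and sample sizes:

```python
import math

# Hypothetical inputs: w_k weights for the m = 4 severity classes,
# c_ik counts per period by class, and n_i hosts scanned per period.
weights = [2.0, 5.5, 8.5, 10.0]
counts = [
    [4, 2, 1, 0],
    [3, 2, 1, 1],
    [5, 1, 2, 0],
]
n = [120, 118, 121]
p, m = len(counts), len(weights)

# D-bar_k: class-k nonconformities per unit over all p periods (Eq. [6])
dbar = [sum(counts[i][k] for i in range(p)) / sum(n) for k in range(m)]

# Center line (Eq. [7])
CL = sum(weights[k] * dbar[k] for k in range(m))

# Limits for period i (Eqs. [8]-[9]) with the usual M = 3.0
M = 3.0
def limits(i):
    sigma_i = math.sqrt(sum(weights[k] ** 2 * dbar[k]
                            for k in range(m)) / n[i])
    return CL + M * sigma_i, max(CL - M * sigma_i, 0.0)

UCL0, LCL0 = limits(0)
```

Note that the lower limit is truncated at zero, since negative demerits per host are impossible.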
the anonymous organizations are assigned labels corre- The standard demerit control chart formulas from sponding to their size, so that organization #1 had the Dodge (1928) are derived as follows. Let di be the most hosts. For example, the correlation between the weighted total number of demerits in period i, ni be the high and critical counts for organization #1 in Table 1 sample size, and cik, be the number of class k noncon- is 0.97 which is significant with a p-value less than formities, k = 1, 2, . . . ,m. If wk is the weight of non- 0.001. Also, Dodge and Romig (1928) assumed that the conformity class k, the weighted demerits di, and the charted quantities (demerits per host), yi, exhibit no demerit per unit Di (which is the charted quantity and autocorrelation if the system is under statistical con- referred to in this article as demerit per host) in period trol. As noted in Table 4 later, the autocorrelation coef- i are: ficients are significant for all three organizations. ∑m It was the violations of assumptions of control charts di = wkcik that motivated new methods such as those in Runger k=1 and Willemain (1995) and Nembhard and Nembard (2000). Runger and Willemain (1995) evaluated resid- and ual charts and determined that applying individuals d D = i . [5] control charts (e.g., see Montgomery, 2012) to batchedi ni Table . The estimated AR() parameters for the three organiza- The average number of demerits, D̄k, across all the p tions (cases). periods, for nonconformity class k is: Case  Case  Case  ∑p Coefficient (ϕ̂) . . . = ∑i=1 cik Mean ( μμ̂ =D̄ [6] 1− ) . . .ϕk p . Sigma (σ̂ ) . . . i=1 ni QUALITY ENGINEERING 317 observations offered relatively desirable average run A user might seek even greater average run length per- lengths. Then, instead of charting the demerits per host formance by optimizing simultaneously over M1 and for each period, yi, one would chart the average of M2. 
It is also possible, the desired in-control average m subgroups. Runger and Willemain (1995) recom- run length cannot be attained using the default value mended batch sizes to reduce the autocorrelations to ofM1 = 3.0. Then, bothM1 andM2 should be adjusted less than 0.1. For cases such as organization #1 with simultaneously to achieve desired in-control average autocorrelation coefficients greater than 0.98, the rec- run length with, again, the in control model being the ommended batch size was m = 58. With each period estimated time series model. lasting a single month, there would be a single sub- group every 4.8 years, which is impractical for cyber vulnerability charting. Comparison of average run lengths With the goal of providing desirable average run In this section, the four charting procedures are com- length performance with an intuitive charted quantity, pared using average run lengths (ARLs). While ARL the following adjusted demerit charting procedure is calculations are skewed by rare long run lengths, we proposed. include them to provide a direct comparison with pre- Step 1: Apply time series modeling from Box and Jenk- vious research on charts for autocorrelated data. The ins (1994) to develop a time series model of the derived ARL values are based on a simulated demerit demerits per host. For example, Table 1 shows the per host data from an autoregressive model. The four coefficients for the AR(1) models derived in the case charting procedures to be compared are: moving cen- studies. terline demerit (MCD) from Nembhard and Nemb- Step 2: Obtain a value of M in Eq. [7] such that the hard (2000), moving centerline residual-based demerit average run length (ARL) with the process in con- (MCRD) which is an extension of residual charts from trol achieves a desired value, e.g., ARL(in control)= Runger and Willemain (1995), adjusted demerit (AD), 200.0. 
The value ofM can be determined using sim- and moving centerline residual and adjusted demerit ulation based on the model derived in Step 1. Apply (MCRAD) charts. The ARL values are estimated using charting to the demerits per host using the derived 20,000 simulations in which the shift (δ) occurs on the M values. first subgroupwith the initial subgroup being subgroup zero following the procedure in Runger andWillemain The above adjusted demerit procedure is facilitated (1995). Therefore, all the ARL estimates have standard by modern computing. Using this procedure, there is deviations less than 1% (0.007× standard deviation) of no assumption about the autocorrelation or cross cor- the estimated ARL values making virtually all compar- relation other than that it can bemodeled appropriately isons significant simultaneously. Therefore also, after in Step 1. the first subgroup all responses derive from Eq. [10] with δ (in increment of 0.5) added. In each case, the simulated demerits per host derived from the standard Moving centerline residual-based and AR(1) model of the demerits per host (yi) with a singleadjusted demerit charts lag can be written for period i: An alternative approach is to chart the demerits per y = μ+ ϕy + ε , [10] host using limits frombothmoving centerline residual- i i−1 i based (MCR) and adjusted demerit (AD) charts. If where the εi are assumed to be independent identically the demerits per host cross any of the control limits, distributed (IID) N(0, σ 2). The coefficients, μ and ϕ, an out-of-control signal is generated by the derived can be estimated through least squares regression using hybrid moving centerline residual-based and adjusted a lag variable, which is available in standard software demerit (MCRAD) chart. Let M1 refer to the param- under the time series menus. eter in Eq. [4] associated with MCD limits and M2 to Table 1 contains the three sets of parameters needed refer to the parameter in Eq. 
[7] associated with AD for simulating the demerit per host data. These were limits. As a default and for simplicity, we set M1 = 3.0 obtained from the three case study datasets described and then findM2 using a two-step procedure similar to later. The related ARL results are shown in Tables 2–4, the one for determining adjusted demerit chart limits. where values under M = 3 are presented to show the 318 A. AFFUL-DADZIE AND T. T. ALLEN Table . Average run length values for an AR() process with estimated parametersϕ= .,μ= ., and σ = . based on data from Case . MCD MCRD AD MCRAD δ/σ M= . M= . M= . M= . M= . M= . M = . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Table . Average run length values for an AR() process with estimated parameters ϕ = .,μ= ., and σ = . for Case . MCD MCRD AD MCRAD δ/σ M= . M= . M= . M= . M= . M= . M = . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . relatively arbitrary performance levels if the standard to directly address the test cases such that its residu- choices are used. For example, theM= 3 in-control run als are IID N(0, σ 2). 
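The Step 2 calibration by simulation can be sketched as follows. The AR(1) parameters and the in-control target are illustrative, and the chart simulated is a fixed-limit adjusted-demerit-style chart around the stationary mean, not the authors' exact implementation:

```python
import random
import statistics

random.seed(42)

# Illustrative AR(1) parameters for Eq. [10]: y_i = mu + phi*y_{i-1} + eps_i
mu, phi, sigma = 1.0, 0.8, 0.3
mean = mu / (1.0 - phi)                    # stationary mean
sd_y = sigma / (1.0 - phi ** 2) ** 0.5     # stationary standard deviation

def run_length(M, max_len=10_000):
    """Periods until the simulated series first leaves mean +/- M*sd_y."""
    y = mean
    for t in range(1, max_len + 1):
        y = mu + phi * y + random.gauss(0.0, sigma)
        if abs(y - mean) > M * sd_y:
            return t
    return max_len

def arl(M, reps=500):
    """Monte Carlo estimate of the average run length for a given M."""
    return statistics.fmean(run_length(M) for _ in range(reps))

arl_3 = arl(3.0)   # increase or decrease M until ARL(in control) ~ 200
```

In practice one would repeat the `arl` call over a grid of M values (or bisect) until the in-control ARL reaches the desired target, then chart the demerits per host with the derived M.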
For the MCRAD chart, the simulations involve values of M1 = 3.0 in Eq. [4] and M2 in Eq. [8] that generate ARL in-control (with no assignable causes active, δ/σ = 0.0) values approximately equal to 200. In all cases where the ARL in-control value is approximately 200, the ARL values for the MCD chart exceed those of the MCRD chart. This is explained by the fact that, in using the same AR(1) model internalized within the MCRD chart, an advantage is conferred to the MCRD chart. In other words, the MCRD charting method is designed to directly address the test cases such that its residuals are IID N(0, σ²). Similarly, the ARLs for the MCD and MCRD exceed those for the AD and MCRAD charts. The exception is for the largest shifts (δ/σ = 4.0) under the assumptions in Table 2. Then, the MCRD chart offers a lower average run length than the AD chart. Yet, the MCRAD chart dominates the MCD and MCRD charts in all cases. Therefore, it is concluded that the use of MCD or MCRD charting in the context of cyber vulnerabilities is generally inadvisable, since the AD and MCRAD methods offer generally superior ARL performance. This assumes that the ability to perform time series modeling and simulation is within the capabilities of the practitioners. The authors have Excel-based software available upon request for generating the M or M2 values required by the AD and MCRAD charts, respectively.

Table 4. Average run length values for an AR(1) process with estimated parameters ϕ, μ, and σ for Case 3, for the MCD, MCRD, AD, and MCRAD charts over shift sizes δ/σ and limit parameters M.

Further, the AD chart dominates the MCRAD chart for small shifts (δ/σ < 3.0) while the MCRAD dominates for large shifts (δ/σ > 3.0). This corroborates Runger and Willemain (1995). Residual-based charts offer a relatively high probability of identifying a large shift on the first subgroup, but adjusted demerit charts offer improved detection probabilities in other cases. The differences are larger for the assumptions in Table 2, which is attributed here to the higher degree of autocorrelation. Runger and Willemain (1995) also found larger differences in ARL values among alternative charts when the degree of autocorrelation was relatively high. The MCRAD charts are recommended for cyber vulnerability modeling because we feel that the ability to quickly detect large shifts is likely relevant in applications. Yet, the AD charts also offer relative simplicity and competitive ARL performance, making them a viable alternative.

Case studies

In this section, the case studies that motivated our research are described. The section begins with the steps taken to prepare the data for attribute charting and the report from the local system administrator about assignable causes. Next, the applications of Box-Jenkins time series modeling are described. Finally, results illustrate possible insights gained using the proposed adjusted demerit (AD) and moving centerline residual-based and adjusted demerit (MCRAD) charting procedures.

Vulnerability data preprocessing

The organizations under study had (altogether) 498 hosts over a 28-month period, using data from the monthly Nessus scans. Nessus is a vulnerability scanning software developed by Tenable Network Security, widely regarded as a world leader in vulnerability scanning. One of the main challenges during a scan is that, if a host is turned off or its firewall is turned on, it will not appear in the final vulnerability report even if it has vulnerabilities.

The steps to generate attribute data were as follows.

Step 1. Identify all distinct vulnerabilities across all hosts and all 28 months. For example, host 1 might have vulnerability 23 (out-of-date operating system) and vulnerability 35 (weak password), while host 2 might have vulnerability 35 only. Combining results from all three of our case studies results in 183 distinct vulnerabilities.

Step 2. List the specific hosts known to have each of the observed vulnerabilities in each month. Table 5 shows a portion of this listing. The numbers in the table are the CVSS scores for the specific vulnerabilities. If an item is blank, it implies that either the host did not have the vulnerability or the host was unavailable during the scanning period. It was found that only 36 of the 498 hosts exhibited any vulnerability during the 28 months. Therefore, the vulnerabilities were concentrated on approximately 7% of the hosts.

Table 5. Excerpt of the data of vulnerabilities and their CVSS scores for a single month (rows: hosts; columns: distinct vulnerabilities; entries: CVSS scores).

Step 3. Tabulate the counts of low, medium, high, and critical vulnerabilities on all hosts for each month. Table 6 shows the counts of vulnerabilities of different levels of severity for two of the hosts. The instances in which hosts were unavailable are marked with borders and bolding.

Step 4. Impute the missing data using the sample averages of the counts from the months before and after each instance (possibly including multiple-month gaps), i.e., mean-based imputation was applied (Enders 2010). For missing data in the first or last months, counts were inserted from the closest month in time for which there were data. Note that such imputations are likely necessary, as missing data in vulnerability datasets are the common result of hosts being unavailable during the system scans. The results are shown in Table 6 in bold font.

Step 5. Tabulate the total number of sampled hosts (n_i) successfully scanned in each period i and the total counts (c_ik) for severity levels k = 1, …, 4 for low, medium, high, and critical vulnerabilities.

Step 6. Calculate the demerits (d_i) per period i using:

d_i = Σ_{k=1}^{4} w_k·c_ik [11]

with weights w_1 = 2.0, w_2 = 5.5, w_3 = 8.5, and w_4 = 10.0 for low, medium, high, and critical, respectively. These weights were determined with reference to the Common Vulnerability Scoring System (CVSS; Mell et al. 2007). Also, the demerits per host, y_i, were derived using y_i = D_i = d_i/n_i. The resulting data are shown in Table 7 for the largest of the three organizations.
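Steps 4–6 can be sketched as follows; the host data are hypothetical, the imputation mirrors the mean-based rule of Step 4, and the weights are those given in Step 6:

```python
# Severity weights from Step 6 (CVSS-referenced)
weights = {"low": 2.0, "medium": 5.5, "high": 8.5, "critical": 10.0}

# Monthly medium-severity counts for one host; None = host unavailable
medium = [0, 1, None, 2, 2, None, None, 1]

def mean_impute(series):
    """Replace each None with the average of the nearest observed counts
    before and after the gap (or the single closest one at the edges)."""
    out = list(series)
    for i, v in enumerate(series):
        if v is None:
            before = next((series[j] for j in range(i - 1, -1, -1)
                           if series[j] is not None), None)
            after = next((series[j] for j in range(i + 1, len(series))
                          if series[j] is not None), None)
            known = [x for x in (before, after) if x is not None]
            out[i] = sum(known) / len(known)
    return out

imputed = mean_impute(medium)   # [0, 1, 1.5, 2, 2, 1.5, 1.5, 1]

# Demerits and demerits per host for one period (Eq. [11] and y_i = d_i/n_i)
counts = {"low": 4, "medium": 2, "high": 1, "critical": 0}
n_hosts = 120
d = sum(weights[k] * c for k, c in counts.items())   # 27.5
y = d / n_hosts                                      # demerits per host
```

Note that a two-month gap receives the same imputed value for both months, since both are averages of the same surrounding observed counts.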
This was accomplished through was motivated by our inspection of the available data removing host permissions and vulnerable software for the 36 hosts for which there were vulnerabili- manually. There was no awareness of any actions taken ties. Table A3 in the appendix provides the CVSS with respect to hosts in organizations #2 and #3 during score for the most severe vulnerability on each host the 28 months. by CVSS score. The data indicate a high degree of Despite the actions taken by the administrator to constancy in the vulnerabilities on the hosts. The 462 patch a select number of hosts, the administrator per- hosts not shown in Table A3 are believed to have ceived only a single unusual occurrence or assignable had no known vulnerabilities during the entire 28 cause during the 28 months. The remaining variation months. was perceived to be typical or associated with com- The system administrator was also interviewed for mon causes only. The assignable cause occurred dur- organization #1, which was the largest of the three. ing month 19 and began to affect counts on month The administrator reported taking 16 manual actions 20. During month 19, there was a major organi- during the 28 month period following direct requests zational change and the administrator lost respon- from the host users. These included manually apply- sibility for approximately 200 hosts. This change ing patches for hosts with automatic patching turned included none of the hosts having vulnerabilities in off, identifying false positives (vulnerabilities reported Table A4. Table . Vulnerability counts for two hosts with imputed data bolded. 
[Table body: monthly Low, Medium, High, Critical counts for Host 1 and Host 2; values lost in extraction.]

Table 7. Tabulation of the hosts scanned successfully and numbers of vulnerabilities of different levels of severity for Case 1. Also included are demerit and demerit-per-host data. Columns: Month, ni, Low, Medium, High, Critical, di, yi = di/ni. [Table values lost in extraction.]

Time series models and autocorrelation

All of the charting methods considered here involve two-step approaches. In the first step, a procedure such as standard time-series modeling (Box and Jenkins 1994) is applied (MCRD, AD, and MCRAD). If the time-series model provides a good fit, the residuals are uncorrelated, approximately normally distributed, and provide useful inputs for charting.

In this section, the application of time series models to the data from the three organizations (three cases) is described. By examining the autocorrelation function (ACF) and partial autocorrelation function (PACF) for the demerits per host for the three cases (Table 7; Tables A1 and A2), it was determined that AR models offer an appropriate choice for all three cases. Figure 1 shows the ACF and PACF for organization #1 (case 1). The results for the three cases indicate that AR(1) models are a good fit for the vulnerability data at hand. These choices are confirmed by studying the ACF and PACF plots of the AR(1) model residuals. In all three cases, the residuals show no evidence of autocorrelation, and normal probability plots (not shown) indicate approximate normality (with the exception of the outlier associated with an assignable cause described above). The residuals for the case #1 data are shown in Figure 2. The corresponding ACF and PACF plots are excluded for cases #2 and #3 since they appear similar to those of case #1.

The relevance of AR(1) models for cyber vulnerability demerits is likely a general phenomenon because it relates to the carryover of the same vulnerabilities and associated demerits from period to period. With the complete data represented by Table 5, it was possible to identify vulnerabilities that appeared in one month but not the next month. Some of the missing vulnerabilities were presumably the result of the host being turned off or its firewall turned on. Yet, by assuming that all of the appearing and disappearing vulnerabilities were patched, an upper estimate of the average patching rate is obtained. For example, for organization #1, 966 total vulnerabilities (not distinct) were identified over 28 months, along with a total of 265 instances in which vulnerabilities were present one month and absent the next. Therefore, the upper bound on the average monthly patching rate is 100% × 265/966 = 27.4%. Table A4 in the appendix contains the upper-bound percentages of vulnerabilities that were patched from one month to the next for organization #1.

Figure 1. Case #1: Demerits per host (a) autocorrelation function and (b) partial autocorrelation function.

Figure 2. Case #1: AR(1) residuals (a) autocorrelation function and (b) partial autocorrelation function.

The AR model form was given in Eq. [10]. While the same AR(1) model form in Eq. [10] applied to all three cases, the degree of autocorrelation represented by the coefficient (ϕ), the mean (μ), and the standard deviation (σ) varied. Table 1 summarizes the coefficients for the three cases from AR(1) modeling. Therefore, organization #1 had tens of carry-over vulnerabilities from period to period (high ϕ), while organization #2 had many demerits per host (over 5) overall (high μ). Organization #3 had relatively lower carryover and good quality levels (low ϕ and low μ).

The application of the proposed methods

Upon consultation with the relevant system administrator and inspection of the data in Table 5, an assignable cause relating to an unusual shift in responsibility was identified. Through a re-organization, approximately 200 hosts were shifted to a different organization in period 20. Therefore, this detection can indeed be considered an assignable cause. The system administrator commented that no other occurrences during the 28 months seemed unusual.

In applying the adjusted demerit (AD) charting, the derived values of M for the three organizations and datasets are M = 20.90, 7.06, and 2.405, respectively. Note that the value 20.90 is much larger than 3.0 because of the relatively high degree of autocorrelation for organization #1. The adjusted demerit chart for the data from organization #1 in Table 7 is given in Figure 3. It is noteworthy, perhaps, that the adjusted demerit chart failed to identify the period 20 shift that both the MCD and MCRD charts (not shown) identified. It is conjectured that this failure highlights the relative strength of residual-based charts relating to immediate identification of shifts in the underlying process. Yet, the adjusted demerit chart has the potential advantage of being better able to identify causes in periods following a shift. The MCRAD chart combines the strengths of residual-based and adjusted demerit charts.

Figure 3. Adjusted demerit (AD) chart for the data from organization #1.

Based on the data in Table 7 from organization #1, the value M2 = 22.9 was used to achieve an approximate in-control average run length equal to 200. The derived MCRAD control chart is given in Figure 4. The chart generates the desired signal on subgroup 20 relating to the assignable shift of 200 hosts that were moved outside the relevant organization. The absence of a lower control limit from the adjusted demerit-related limits is due to the value M2 = 22.79. Larger values of M2 may generally be expected if the degree of autocorrelation is high. The value ϕ = 0.920 (Table 1) associated with the organization #1 dataset indicates a relatively high carryover of vulnerabilities from period to period.

Figure 4. MCRAD chart for the data from organization #1.

The MCRAD charts for the data from organization #2 (Table A1) and organization #3 (Table A2) are given in Figures 5 and 6, respectively. Concerning organization #2, the chart in Figure 5 shows no out-of-control signals. For organization #3, the chart in Figure 6 shows an out-of-control signal in month three. This is a result of the demerit per host for month 3 exceeding the residual-based chart control limit. From the perspective of the authors, this signal appears to be a false alarm.

Figure 5. MCRAD chart for the data from organization #2.

Figure 6. MCRAD chart for the data from organization #3.

Conclusions

This article addresses the important problem of monitoring cyber vulnerabilities using statistical process control (SPC) methods. The problem is important because of the high and growing threat level associated with cyber-attacks and the widespread use of personal computers and other hosts. A process is proposed to convert vulnerability data into demerits per host based on the common vulnerability scoring system (Mell et al. 2007). The application of standard time series models to cyber vulnerability data from three organizations is then described.

Application of two residual-based methods taken from the literature is then investigated. The application involves moving centerline demerit (MCD) charting from Nembhard and Nembhard (2000) and a slight extension of the residual charts from Runger and Willemain (1995). The MCD chart offers the advantage of charting the relatively intuitive demerits per host instead of residuals, which motivated the extension to create moving centerline residual-based demerit (MCRD) charts. The proposed adjusted demerit (AD) and hybrid moving centerline residual-based and adjusted demerit (MCRAD) charting methods are based on using simulation to determine the control limits. Average run length (ARL) comparisons were based on assumptions relevant to the three case studies. From this it is concluded that the proposed AD and MCRAD charts offer improved ARL performance compared with MCD and MCRD charts. The MCRAD charts in particular are recommended as a dashboard for monitoring cyber vulnerabilities. Also, the concepts of AD and MCRAD charts have applicability beyond cyber vulnerabilities and demerit charts to many charting situations involving autocorrelation.
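The two-step approach described above — fit an AR(1) model to the demerits-per-host series, then chart each observation against moving, residual-based limits — can be sketched as follows. This is a simplified illustration, assuming a lag-1 Yule-Walker-style coefficient estimate and symmetric ±3σ residual limits; it does not reproduce the article's simulation-based MCRAD limit constants (M1, M2), and the simulated series and all names are illustrative.

```python
import random
import statistics

def fit_ar1(y):
    """Estimate the mean, AR(1) coefficient, and residual standard deviation
    of a series, using the lag-1 sample autocorrelation as the coefficient."""
    mu = statistics.fmean(y)
    dev = [v - mu for v in y]
    phi = sum(a * b for a, b in zip(dev, dev[1:])) / sum(d * d for d in dev)
    resid = [dev[t] - phi * dev[t - 1] for t in range(1, len(dev))]
    return mu, phi, statistics.stdev(resid)

def moving_centerline_limits(y, mu, phi, sigma, m=3.0):
    """For t = 1, 2, ..., the one-step-ahead AR(1) forecast is the moving
    centerline; y_t is charted against centerline +/- m * sigma."""
    limits = []
    for t in range(1, len(y)):
        center = mu + phi * (y[t - 1] - mu)
        limits.append((center - m * sigma, center, center + m * sigma))
    return limits

# Simulate an autocorrelated demerits-per-host series (high carryover)
random.seed(1)
true_mu, true_phi, noise_sd = 5.0, 0.9, 0.3
series, value = [], true_mu
for _ in range(60):
    value = true_mu + true_phi * (value - true_mu) + random.gauss(0.0, noise_sd)
    series.append(value)

mu, phi, sigma = fit_ar1(series)
limits = moving_centerline_limits(series, mu, phi, sigma)
signals = [t for t, (lo, _, hi) in enumerate(limits, start=1)
           if not lo <= series[t] <= hi]
print(f"phi = {phi:.3f}, signals at t = {signals}")
```

A point far outside its moving limits, like the period-20 shift discussed in the text, would appear in `signals`; the article's MCRAD chart additionally overlays adjusted-demerit limits on the same statistic.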
The conclusion is that AR models with a single lag, i.e., AR(1) processes, accurately model the three datasets and are possibly relevant for many other vulnerability modeling problems. The motivation for this choice relates to the carryover of the same unpatched vulnerabilities from one period to the next. Since the hosts are monitored instead of parts, one does not have new units each period.

About the authors

Anthony Afful-Dadzie is a lecturer at the University of Ghana Business School. He received his Ph.D. from The Ohio State University in Industrial & Systems Engineering and his MPhil from Cambridge. His research and teaching interests include quality engineering, cyber security, and economic models.

Theodore T. Allen is an associate professor of Integrated Systems Engineering at The Ohio State University. He is a fellow of the American Society for Quality and the author of over 50 peer-reviewed publications including 2 textbooks. His research interests focus on optimization with parametric uncertainty including optimal experimental design and cyber security maintenance plan design.

Funding

This work was partially supported by National Science Foundation (NSF) grant #1409214.

References

Abedin, M., S. Nessa, E. Al-Shaer, and L. Khan. 2006. Vulnerability analysis for evaluating quality of protection of security policies. Proceedings of the 2nd ACM Workshop on Quality of Protection, Alexandria, Virginia, pp. 49–52.

Ahmed, M. S., E. Al-Shaer, and L. Khan. 2008. A novel quantitative approach for measuring network security. Proceedings of the 27th IEEE INFOCOM 2008 Mini-Conference, Phoenix, Arizona, pp. 1957–1965.

Alwan, L. C., and H. V. Roberts. 1988. Time series modeling for statistical process control. Journal of Business and Economic Statistics 6 (1): 87–95.

Box, G. E. P., G. M. Jenkins, and G. C. Reinsel. 1994. Time series analysis: Forecasting and control. 3rd ed. Englewood Cliffs, NJ: Prentice Hall.

Cox, D. R. 1961. Prediction by exponentially weighted moving averages and related methods. Journal of the Royal Statistical Society, Series B 23 (2): 414–422.

Dodge, H. F. 1928. A method of rating a manufactured product. Bell System Technical Journal 7: 350–368.

Dowdy, J. 2012. The cybersecurity threat to US growth and prosperity. In Securing cyber space: A new domain for national security, eds. N. Burns and J. Price. Washington, DC: Aspen Institute.

Enders, C. K. 2010. Applied missing data analysis. New York: Guilford Press.

Loredo, E. N., D. Jearkpaporn, and C. M. Borror. 2002. Model-based control chart for autoregressive and correlated data. Quality and Reliability Engineering International 18: 489–496.

Mell, P., K. Scarfone, and S. Romanosky. 2007. CVSS: A complete guide to the common vulnerability scoring system version 2.0. In Forum of Incident Response and Security Teams.

Montgomery, D. C. 2012. Introduction to statistical quality control. 7th ed. Hoboken, NJ: Wiley.

Montgomery, D. C., and C. M. Mastrangelo. 1991. Some statistical process control methods for autocorrelated data. Journal of Quality Technology 23 (3): 179–193.

Nembhard, D. A., and H. B. Nembhard. 2000. A demerit control chart for autocorrelated data. Quality Engineering 13 (2): 179–190.

Ponemon, L. 2010. Fifth annual US cost of data breach study: Understanding financial impact, customer turnover and preventive solutions. Traverse City, MI: Ponemon Institute.

Runger, G. C., and T. R. Willemain. 1995. Model-based and model independent control of autocorrelated processes. Journal of Quality Technology 27: 283–292.

Appendix

This appendix includes additional data about vulnerabilities from our case studies. Table A1 describes the demerits for organization #2 and Table A2 describes the demerits for organization #3.
Table A3 describes the CVSS score for the most severe vulnerability on each host for the 36 hosts which had vulnerabilities (out of 498) for organization #1. Table A4 provides data on the worldwide known vulnerability counts and local patching percentages during the 28-month period for organization #1.

Table A1. Tabulation of the hosts scanned successfully and numbers of vulnerabilities of different levels of severity for organization #2. Also included are demerit sums based on the counts. Columns: Month, Number of Hosts, Low, Medium, High, Critical, Demerits, Demerits per Host. [Table values lost in extraction.]

Table A2. Tabulation of the hosts scanned successfully and numbers of vulnerabilities of different levels of severity for organization #3. Also included are demerit sums based on the counts. Columns as in Table A1. [Table values lost in extraction.]

Table A3. Complete listing of CVSS scores for the most severe vulnerabilities on each host over the 28 months. Hosts with no vulnerabilities were omitted from the list. Empty cells are missing data.
[Table A3 body: per-host monthly CVSS scores; values lost in extraction.]

Table A4. Monthly cumulative number of worldwide vulnerabilities and the local percentage of new vulnerabilities that is patched each month, i.e., the number of vulnerabilities present in one month's scan but missing in the next month's scan, divided by the total number of vulnerabilities in the first month.
Columns: Month; Cumulative Count of Distinct, Known Vulnerabilities Worldwide; Vulnerability Patching Percentage. [Table values lost in extraction.]
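The patching percentage defined in the Table A4 caption — vulnerabilities present in one month's scan but absent from the next, divided by the first month's total — can be computed from per-month scan results as sketched below. The host and vulnerability identifiers are invented for illustration.

```python
def patching_percentage(this_month, next_month):
    """Percent of this month's (host, vulnerability) pairs absent from the
    next month's scan. As noted in the text, this is only an upper bound on
    the true patching rate, since a host may simply have been unreachable."""
    if not this_month:
        return 0.0
    disappeared = this_month - next_month
    return 100.0 * len(disappeared) / len(this_month)

# Hypothetical scans as sets of (host, vulnerability) pairs
month_1 = {("h1", "vulnA"), ("h1", "vulnB"), ("h2", "vulnC"), ("h3", "vulnD")}
month_2 = {("h1", "vulnA"), ("h2", "vulnC"), ("h3", "vulnE")}
print(patching_percentage(month_1, month_2))  # 2 of 4 disappeared -> 50.0

# The text's organization #1 aggregate: 265 of 966 instances -> 27.4%
print(round(100 * 265 / 966, 1))  # 27.4
```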