Hindawi Journal of Engineering, Volume 2019, Article ID 1432597, 19 pages. https://doi.org/10.1155/2019/1432597

Research Article

Decision Support System (DSS) for Fraud Detection in Health Insurance Claims Using Genetic Support Vector Machines (GSVMs)

Robert A. Sowah,1 Marcellinus Kuuboore,1 Abdul Ofoli,2 Samuel Kwofie,3 Louis Asiedu,4 Koudjo M. Koumadi,1 and Kwaku O. Apeadu1

1Department of Computer Engineering, University of Ghana, PMB 25, Legon, Accra, Ghana
2Electrical and Computer Engineering Department, University of Tennessee, Chattanooga, TN, USA
3Department of Biomedical Engineering, University of Ghana, Legon, Accra, Ghana
4Department of Statistics and Actuarial Science, University of Ghana, Legon, Accra, Ghana

Correspondence should be addressed to Robert A. Sowah; rasowah@ug.edu.gh

Received 25 January 2019; Accepted 1 August 2019; Published 2 September 2019

Academic Editor: Kamran Iqbal

Copyright © 2019 Robert A. Sowah et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Fraud in health insurance claims has become a significant problem whose rampant growth has deeply affected the global delivery of health services. In addition to the financial losses incurred, patients who genuinely need medical care suffer because service providers are not paid on time, owing to delays in the manual vetting of their claims, and are therefore unwilling to continue offering their services. Health insurance claims fraud is committed through service providers, insurance subscribers, and insurance companies. The need for the development of a decision support system (DSS) for accurate, automated claim processing to offset the attendant challenges faced by the National Health Insurance Scheme cannot be overstated.
This paper utilized the National Health Insurance Scheme claims dataset obtained from hospitals in Ghana for detecting health insurance fraud and other anomalies. Genetic support vector machines (GSVMs), a novel hybridized data mining and statistical machine learning tool providing a set of sophisticated algorithms for the automatic detection of fraudulent claims in health insurance databases, are used. The experimental results show that the GSVM possessed better detection and classification performance when applied using SVM kernel classifiers. Three GSVM classifiers were evaluated and their results compared. Experimental results show a significant reduction in computational time for claims processing while increasing classification accuracy via the various SVM classifiers (linear (80.67%), polynomial (81.22%), and radial basis function (RBF) kernel (87.91%)).

1. Introduction

Low-income countries have made significant development policy frameworks for the sustainability of growth. These frameworks include healthcare delivery. Ghana is one of the countries which aspire to provide effective and efficient health care. In pursuit of this goal, the National Health Insurance Scheme (NHIS) was established by an Act of Parliament, Act 650, in 2003 [1].

The NHIS, as a social protection initiative, aims at providing financial risk protection against the cost of primary health care for residents of Ghana, and it has replaced the hitherto obnoxious cash-and-carry system of paying for health care at the point of receiving service. Since its introduction, the scheme has grown to become a significant instrument for financing healthcare delivery in Ghana. For effective and efficient implementation, the NHIS introduced a tariff as a standardized primary fee for services rendered to its beneficiaries at their affiliated health institutions. This standardized tool was reviewed in January 2007 by the National Health Insurance Authority (NHIA), the governing body of the NHIS, to develop a new tariff for the NHIS due to its expansion of service coverage.

The new tariff was developed based on the GDRG (Ghana Diagnostic Related Group) system to include various clinical conditions and surgical procedures grouped under eleven Major Diagnostic Categories (MDCs) or clinical specialties, namely, Adult Medicine, Pediatrics, Adult Surgery, Pediatric Surgery, Ear, Nose and Throat (ENT), Obstetrics and Gynecology, Dental, Ophthalmology, Orthopedics, Reconstructive Surgery, and Out-Patients' Department (OPD) [2]. These specialties provide a guide to the claim adjudication process and the operational mechanism for reporting claims, determine the reimbursement process, and create standards of operation between the NHIS and service providers [2].

The GDRG code structure uses seven alphanumeric characters. The first four characters represent the MDC or clinical specialty. The next two characters are numbers representing the index of the GDRG within the MDC. The last character (A or C) represents the age category: "A" represents those greater than or equal to 12 years, and "C" stands for those less than 12 years.

The World Health Organization (WHO) provided the International Classification of Diseases (ICD-10) to meet the requirements for claim submission [3, 4], but the NHIS utilized the GDRG codes since it has full control over them. Hence, the GDRG codes are used to develop the fraud detection model.

A claim is a detailed invoice that service providers send to the health insurer, showing exactly what services a patient or patients received at the point of healthcare service delivery. Claim processing is the major challenge for providers under Health Insurance Schemes (HIS) globally due to excessive fraud in submitted claims and gaming of the system through well-coordinated schemes to siphon money from its coffers [5–9].

Fraud in health care is classified into three categories, namely, (1) service provider (hospitals and physicians) fraud, (2) beneficiary (patient) fraud, and (3) insurer fraud [8, 10–13]. Several types of fraud schemes form the basis of this problem in health insurance programs worldwide. These are (1) billing for services not rendered (identity theft and phantom billing), (2) upcoding of services and items, (3) duplicate billing, (4) unbundling of claims (unbundling/creative billing), (5) medically unnecessary services (bill padding), (6) excessive services (bill padding), (7) kickbacks, (8) impersonation, (9) ganging, (10) illegal cash exchange for prescriptions, (11) frivolous use of services, (12) insurance carriers' fraud, (13) falsifying reimbursement, and (14) insurance subscribers' fraud, among others [9, 13–19].

It was estimated conservatively that at least 3%, or more than $60 billion, of the US's annual healthcare expenditure was lost due to fraud. Other estimates by government and law enforcement agencies placed this loss as high as 10%, or $170 billion [9, 12]. In addition to financial loss, fraud also severely hinders the US healthcare system from providing quality care to legitimate beneficiaries [9]. Hence, effective fraud detection is essential for improving the quality and reducing the cost of healthcare services.

The National Health Care Antifraud Association report in [12, 20] intimated that healthcare fraud strips nearly $70 billion from the healthcare industry each year. In response to these realities, the Health Insurance Portability and Accountability Act of 1996 (HIPAA) specifically established healthcare fraud as a federal criminal offense, with the primary crime carrying a federal prison term of up to 10 years in addition to significant financial penalties [8, 21].

This paper presents a hybridized approach combining genetic algorithms and support vector machines (GSVMs) to solve the health insurance claim classification problem and eliminate fraudulent claims while minimizing conversion and labour costs through automated claim processing. The significant contributions of this paper are as follows: (1) analysis of existing data mining and machine learning techniques (decision tree, Bayesian networks, Naïve–Bayes classifier, and support vector machines) for fraud detection; (2) development of a novel fraud detection model for insurance claims processing based on genetic support vector machines; (3) design, development, and deployment of a decision support system (DSS) which incorporates the fraudulent claim detection model, business intelligence, and knowledge representation for claims processing at NHIS Ghana; (4) development of a user-friendly graphical user interface (GUI) for the intelligent fraud detection system; and (5) evaluation of the health insurance claims fraud detection system using Ghana National Health Insurance subscribers' data from different hospitals.

The outline of the paper is as follows: Section 1 presents the introduction and problem statement with research objectives. Section 2 outlines the systematic literature review on various machine learning and data mining techniques for health insurance claims fraud detection. Section 3 gives the theoretical and mathematical foundations of genetic algorithms (GAs), support vector machines (SVMs), and the hybrid genetic support vector machines (GSVMs) in combating this global phenomenon. Section 4 provides the proposed methodology for the GSVM fraud detection system, its design processes, and development. Section 5 comprises the design and implementation of the genetic support vector machines, while Section 6 presents the key findings of the research with conclusions and recommendations for future work.

2. Literature Review

Research into the health insurance claims fraud domain requires a clear, distinctive view of what fraud is, because it is sometimes lumped together with abuse and waste. Fraud and abuse refer to situations where healthcare service is paid for but not provided or reimbursement of funds is made to third-party insurance companies. Fraud and abuse are further explained as healthcare providers receiving kickbacks, patients seeking treatments that are potentially harmful to them (such as seeking drugs to satisfy addictions), and the prescription of services known to be unnecessary [12, 17–19]. Health insurance fraud is an intentional act of deceiving, concealing, or misrepresenting information that results in healthcare benefits being paid to an individual or group.

Health insurance fraud detection involves account auditing and detective investigation. Careful account auditing can reveal suspicious providers and policyholders. Ideally, it would be best to audit all claims one by one. However, auditing all claims is not feasible by any practical means. Furthermore, it is challenging to audit providers without concrete clues. A practical approach is to develop shortlists for scrutiny and perform auditing on the providers and patients in the shortlists. Various analytical techniques can be employed in developing audit shortlists.

The most common fraud detection techniques reported in the literature include machine learning, data mining, AI, and statistical methods. A cost-saving model using the Naïve–Bayes algorithm was used to create a subsample of 20 claims consisting of 400 objects, where 50% of the objects were classified as fraudulent and the other 50% as legal, which ultimately does not give a clear picture of the decision when compared to other classifiers [22].

The integration of multiple traditional methods has emerged as a new research area in combating fraud. This approach could be supervised, unsupervised, or both, with one method depending on the other for classification. One method may be used as a preprocessing step to modify the data in preparation for classification [9, 23, 24], or, at a lower level, the individual steps of the algorithms can be intertwined to create something fundamentally original. Hybrid methods can be used to tailor solutions to a particular problem domain. Different aspects of performance can be specifically targeted, including classification ability, ease of use, and computational efficiency [14].

Fuzzy logic was combined with neural networks to assess and automatically classify medical claims [14]. The concept of data warehousing for data mining purposes in health care was applied to develop an electronic fraud detection application that reviews service providers on behavioral heuristics and compares them to similar service providers. Australia's Health Insurance Commission has explored the online discounting learning algorithm to identify rare cases in pathology insurance data [10, 25–27].

Researchers in Taiwan developed a detection model based on process mining that systematically identified practices derived from clinical pathways to detect fraudulent claims [8].

Results published in [28, 29] used Benford's Law distributions to detect anomalies in claims reimbursements in Canada. Despite the detection of some anomalies and irregularities, the ability to identify suspected claims is very limited for health insurance claim fraud detection, since it applies only to service providers with payer-fixed prices.

Neural networks were used to develop an application for detecting medical abuse and fraud for a private health insurance scheme in Chile [30]. The ability to process claims on a real-time basis accounts for the innovative nature of this method. The application of association rule mining to examine billing patterns within a particular specialist group to detect suspicious claims and potentially fraudulent individuals was incorporated in [9, 22, 30].

3. Mathematical Foundations for Genetic Support Vector Machines

In the 1960s, John Holland invented genetic algorithms by involving a simulation of Darwinian survival of the fittest as well as the processes of crossover, mutation, and inversion that occur in genetics. Holland demonstrated that, under certain assumptions, the GA indeed achieves an optimal balance [31–34]. In contrast with evolution strategies and evolutionary programming, Holland's original goal was not to design algorithms to solve specific problems but rather to formally study the phenomenon of adaptation as it occurs in nature and to develop ways in which the mechanisms of natural adaptation might be imported into computer systems. Moreover, Holland was the first to attempt to put computational evolution on a firm theoretical footing [35].

Genetic algorithms operate through three main operators, namely, (1) reproduction, (2) crossover, and (3) mutation. A typical genetic algorithm requires (1) a genetic representation of the solution domain and (2) a fitness function to evaluate the solution domain [31–34]. Reproduction is controlled by the crossover and mutation operators. Crossover is the process whereby genes are selected from the parent chromosomes and new offspring are produced. Mutation is designed to add diversity to the population and ensure the possibility of exploring the entire search space; it replaces the values of some randomly selected genes of a chromosome with arbitrary new values [33, 35]. During the reproduction stage, an individual is assigned a fitness value derived from its raw performance measure given by the objective function.

The support vector machine (SVM), a statistical machine learning technique, was introduced in 1995 by Vapnik and Cortes as an alternative to polynomial, radial basis function, and multilayer perceptron classifiers, in which the weights are found by solving a quadratic programming (QP) problem with linear inequality and equality constraints rather than by solving a nonconvex, unconstrained minimization problem [36–39]. As a machine learning technique for binary classification, regression analysis, face detection, text categorization in bioinformatics and data mining, and outlier detection, SVMs face challenges when the dataset is very large, due to the dense nature and memory requirements of the quadratic form of the dataset. However, the SVM is an excellent example of supervised learning that tries to maximize generalization by maximizing the margin and supports nonlinear separation using kernelization [40]. The SVM tries to avoid both overfitting and underfitting. The margin in an SVM denotes the distance from the decision boundary to the closest data points in the feature space.

Given the claims training dataset x_i ∈ R^n mapped into the feature space F, the linear hyperplane dividing the data into the two labelled classes y_i (fraud and legal) can be written mathematically as

  ω^T x_i + b = 0,  ω ∈ R^n, b ∈ R.  (1)

Figure 1: Standard formulation of SVM. (The figure shows the separating hyperplane ω^T φ(x) + b = 0 with the margin planes ω^T φ(x) + b = ±1, the support vectors, the margin 2/√(ω^T ω), the legitimate and fraudulent claims, and a misclassified point.)

Assume the training dataset is correctly classified, as shown in Figure 1. This means that the SVC computes the hyperplane that maximizes the margin separating the classes (legal claims and fraud claims).

In its simplest linear form, an SVC is a hyperplane that separates the legal claims from the false claims with a maximum margin. Finding this hyperplane involves obtaining two hyperplanes parallel to it, as shown in Figure 1, each at an equal distance forming the maximum margin. All the training data satisfy the constraints

  ω^T x_i + b ≥ +1, for y_i = +1,
  ω^T x_i + b ≤ −1, for y_i = −1,  (2)

where ω is the normal to the hyperplane, |b|/‖ω‖ is the perpendicular distance from the hyperplane to the origin, and ‖ω‖ is the Euclidean norm of ω. The separating hyperplane is defined by the plane ω^T x_i + b = 0, and the constraints in (2) are combined to form

  y_i (ω^T x_i + b) ≥ 1.  (3)

The pair of hyperplanes that gives the maximum margin can be found by minimizing ‖ω‖², subject to constraint (3). This leads to a quadratic optimization problem formulated as

  Minimize f(ω, b) = ‖ω‖²/2,
  subject to y_i (ω^T x_i + b) ≥ 1, ∀i = 1, …, n.  (4)

This problem is reformulated by introducing Lagrange multipliers α_i (i = 1, …, n), one for each constraint, and subtracting the weighted constraints from the objective. This results in the primal Lagrangian function

  L_P(ω, b, α) = ‖ω‖²/2 − Σ_{i=1}^n α_i (y_i (ω^T x_i + b) − 1), ∀i = 1, …, n.  (5)

Taking the partial derivatives of L_P(ω, b, α) with respect to ω and b, respectively, and applying duality theory yields

  ∂L_P/∂ω = 0 ⟹ ω = Σ_{i=1}^n α_i y_i x_i,
  ∂L_P/∂b = 0 ⟹ Σ_{i=1}^n α_i y_i = 0.  (6)

The problem defined in (5) is a quadratic optimization (QP) problem. Maximizing the primal problem L_P with respect to α_i, subject to the constraints that the gradients of L_P with respect to ω and b vanish and that α_i ≥ 0, gives the two conditions

  ω = Σ_{i=1}^n α_i y_i x_i,  Σ_{i=1}^n α_i y_i = 0.  (7)

Substituting these constraints gives the dual formulation of the Lagrangian:

  Maximize_α L_D(α) = Σ_{i=1}^n α_i − (1/2) Σ_{i=1}^n Σ_{j=1}^n α_i α_j y_i y_j (x_i · x_j),
  subject to Σ_{i=1}^n α_i y_i = 0, α_i ≥ 0; i = 1, …, n.  (8)

The values of ω and b are then obtained from the respective equations

  ω = Σ_{i=1}^n α_i y_i x_i,
  b = −(1/2) (min_{i: y_i = +1} ω^T x_i + max_{i: y_i = −1} ω^T x_i).  (9)

Also, the Lagrange multipliers satisfy the complementary slackness condition

  α_i (1 − y_i (ω^T x_i + b)) = 0.  (10)

Hence, this dual Lagrangian L_D is maximized with respect to its nonnegative α_i, giving a standard quadratic optimization problem. The training vectors x_i with nonzero Lagrange multipliers α_i are called support vectors (SVs); for these,

  y_i (ω^T x_i + b) = 1.  (11)

Although this SVM classifier can only have a linear hyperplane as its decision surface, its formulation can be extended to build a nonlinear SVM. SVMs with nonlinear decision surfaces can classify nonlinearly separable data by introducing a soft-margin hyperplane, as shown in Figure 2. Introducing the slack variables ξ_i into the constraints yields

  ω^T x_i + b ≥ 1 − ξ_i, for y_i = +1,
  ω^T x_i + b ≤ −1 + ξ_i, for y_i = −1,
  ξ_i ≥ 0, ∀i.  (12)

Figure 2: Linear separating hyperplanes for the nonseparable case of SVC by introducing the slack variable (ξ).

These slack variables help find the hyperplane that yields the minimum number of training errors. Modifying equation (4) to include the slack variables yields

  Minimize_{ω, b, ξ} ‖ω‖²/2 + C Σ_{i=1}^n ξ_i.  (13)

The parameter C is finite; the larger the value of C, the heavier the penalty on errors. The Karush–Kuhn–Tucker (KKT) conditions are necessary to ensure optimality of the solution to a nonlinear programming problem:

  y_i (ω^T x_i + b) − 1 ≥ 0, i = 1, 2, …, l, ∀i,
  α_i (y_i (ω^T x_i + b) − 1) = 0, α_i ≥ 0, ∀i.  (14)

The KKT conditions for the primal problem are used in the nonseparable case, after which the primal Lagrangian becomes

  L_P = ‖ω‖²/2 + C Σ_{i=1}^n ξ_i − Σ_{i=1}^n α_i (y_i (ω^T x_i + b) − 1 + ξ_i) − Σ_{i=1}^n β_i ξ_i,  (15)

with β_i as the Lagrange multipliers enforcing positivity of the slack variables ξ_i. Applying the KKT conditions to this primal problem yields

  ∂L_P/∂ω_u = ω_u − Σ_{i=1}^n α_i y_i x_{iu} = 0,
  ∂L_P/∂b = Σ_{i=1}^n α_i y_i = 0,
  ∂L_P/∂ξ_i = C − α_i − β_i = 0,
  α_i (y_i (ω^T x_i + b) − 1 + ξ_i) = 0,
  y_i (ω^T x_i + b) − 1 + ξ_i ≥ 0,
  α_i, β_i, ξ_i ≥ 0, C > 0, i = 1, 2, …, n and u = 1, 2, …, d,  (16)

where the parameter d represents the dimension of the dataset. Observing the expressions obtained above after applying the KKT conditions yields ξ_i = 0 for α_i = 0; such a point does not participate in the derivation of the separating function, whereas α_i = C for ξ_i > 0:

  α_i = 0,
  y_i (ω^T x_i + b) − 1 + ξ_i = 0.  (18)

A nonlinear SVM maps the training samples from the input space into a higher-dimensional feature space F via a kernel mapping function Φ. In the dual Lagrangian function, the inner products are replaced by the kernel function:

  Φ(x_i) · Φ(x_j) = k(x_i, x_j).  (19)

Effective kernels allow the separating hyperplane to be found without high computational cost. The nonlinear SVM dual Lagrangian is

  L_D(α) = Σ_{i=1}^n α_i − (1/2) Σ_{i=1}^n Σ_{j=1}^n α_i α_j y_i y_j k(x_i, x_j),  (20)

subject to

  Σ_{i=1}^n α_i y_i = 0,  0 ≤ α_i; i = 1, …, n,  (21)

which is like that of the generalized linear case. The nonlinear SVM separating hyperplane is illustrated in Figure 3 with the support vectors, class labels, and margin.

Figure 3: Nonlinear separating hyperplane for the nonseparable case of SVM. (The figure shows the two classes, the support vectors, the hyperplane, and the margin.)

This model can be solved by the same optimization method as the separable case. Therefore, the optimal hyperplane has the following form:

  f(x) = Σ_{i=1}^n α_i y_i k(x_i, x) + b,  (22)

where b is the offset of the decision boundary from the origin. Hence, classifying a newly arrived data point x implies

  g(x) = sign(f(x)).  (23)

However, feasible kernels must be symmetric, i.e., the matrix K with components k(x_i, x_j) must be positive semidefinite and satisfy Mercer's condition given in [39, 40]. The summarized kernel functions considered in this work are given in Table 1.

Table 1: Summarized kernel functions used.
  Kernel name                  | Parameters      | Kernel function
  Radial basis function (RBF)  | γ ∈ R           | k(x_i, x_j) = e^(−γ‖x_i − x_j‖²)
  Polynomial function          | c ∈ R, d ∈ N    | k(x_i, x_j) = (x_i · x_j + c)^d

These kernels satisfy Mercer's condition, with the RBF or Gaussian kernel being the most widely used kernel function in the literature. The RBF kernel has the advantage of adding a single free parameter γ > 0, which controls the width of the kernel via γ = 1/(2σ²), where σ² is the variance of the resulting Gaussian hypersphere. The linear kernel is given as k(x_i, x_j) = x_i · x_j. Consequently, the training of SVMs uses the solution of the QP optimization problem.

The above mathematical formulations form the foundation for the development and deployment of genetic support vector machines as the decision support tool for detecting and classifying fraudulent health insurance claims. In recent times, the decision-making activities of knowledge-intensive enterprises depend holistically on the successful classification of data patterns, despite the time and computational resources required to achieve the results due to the complexity and size of the dataset.

4. Methodology for GSVM Fraud Detection

The systematic approach adopted for the design and development of genetic support vector machines for health insurance claims fraud detection is presented in the conceptual framework in Figure 4 and the flow chart implementation in Figure 5.

The conceptual framework incorporates the design and development of the key algorithms that enable submitted claims data to be analysed and a model to be developed for testing and validation. The flow chart presents the algorithm implemented based on the theoretical foundations, incorporating genetic algorithms and support vector machines, two useful machine learning algorithms necessary for fraud detection. Their combined use in the detection process generates accurate results. The methodology for the design and development of genetic support vector machines consists of three (3) significant steps, namely, (1) data preprocessing, (2) classification engine development, and (3) data postprocessing.

4.1. Data Preprocessing. Data preprocessing is the first significant stage in the development of the fraud detection system. This stage involves the use of data mining techniques to transform the data from its raw form into the format required by the SVC for the detection and identification of health insurance claims fraud.

The data preprocessing stage involves the removal of unwanted customers, missing records, and data smoothing. This ensures that only useful and relevant information is extracted for the next process.

Before preprocessing, the data were imported from MS Excel CSV format into a MySQL database called NHIS. The imported data include the electronic Health Insurance Claims (e-HIC) data and the HIC tariff datasets as tables in the NHIS database. The e-HIC data preprocessing involves the following steps: (1) claims data filtering and selection, (2) feature selection and extraction, and (3) feature adjustment.

Figure 4: Conceptual model design and development of the genetic support vector machines. (The model routes duplicated, upcoded, unbundled, uncovered, and valid claims through the fraud detection model and classifier, labels each claim as legal or fraudulent, and applies GSVM optimization on failure.)
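The claims data filtering step above checks each claim's billed cost against the approved tariff of its GDRG, and the GDRG code itself carries the MDC, group number, and age category described in Section 1. The following is a minimal illustrative sketch of both checks in Python (the paper's implementation is in MATLAB); the field names, sample code "ADME01A", and tariff values are assumptions for illustration, not the actual NHIS schema.

```python
# Illustrative sketch only: field names (gdrg_code, service_bill) and tariff
# values are hypothetical, not the actual NHIS schema.

def parse_gdrg(code):
    """Split a seven-character GDRG code into its documented parts:
    four-character MDC/specialty, two-digit group number, age category A or C."""
    if len(code) != 7 or not code[4:6].isdigit() or code[6] not in ("A", "C"):
        raise ValueError("malformed GDRG code: %r" % code)
    return {
        "mdc": code[:4],                     # clinical specialty
        "group": int(code[4:6]),             # GDRG index within the MDC
        "age_band": "adult (>= 12 years)" if code[6] == "A" else "child (< 12 years)",
    }

def partition_claims(claims, tariffs):
    """Partition claims into (valid, invalid) by comparing each billed cost
    against the approved tariff for its diagnostic-related group."""
    valid, invalid = [], []
    for claim in claims:
        approved = tariffs.get(claim["gdrg_code"])
        if approved is not None and claim["service_bill"] <= approved:
            valid.append(claim)
        else:
            invalid.append(claim)
    return valid, invalid

# Hypothetical example: one claim within tariff, one above it.
claims = [
    {"gdrg_code": "ADME01A", "service_bill": 35.0},
    {"gdrg_code": "ADME01A", "service_bill": 90.0},   # above tariff -> invalid
]
tariffs = {"ADME01A": 50.0}
valid, invalid = partition_claims(claims, tariffs)
```

Only the claims in `valid` would proceed to feature selection and SVM training; the `invalid` partition corresponds to the claims with costs above the approved tariffs.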
The WEKA machine learning and knowledge analysis environment was used for feature selection and extraction, while the data processing code was written in the MATLAB technical computing environment. The developed MATLAB-based decision support engine was connected to MySQL using the script shown in Figure 6.

Preprocessing of the raw data involves claims cost validity checks. The tariff dataset consists of the approved tariffs for each diagnostic-related group, which were strictly enforced to clean the data before further processing. Claims are partitioned into two groups, namely, (1) claims with valid, approved costs within each DRG and (2) claims with invalid costs (those above the approved tariffs within each DRG).

With the recent increase in the volume of real datasets and the dimensionality of the claims data, there is an urgent need for faster, more reliable, and cost-effective data mining techniques for classification models. Data mining techniques require the extraction of a smaller, optimized set of features, which can be obtained by removing largely redundant, irrelevant, and unnecessary features for the class prediction [41].

Feature selection algorithms are utilized to extract a minimal subset of attributes such that the resulting probability distribution of the data classes is close to the original distribution obtained using all attributes. Based on the idea of survival of the fittest, a new population is constructed to comply with the fittest rules in the current population, as well as the offspring of these rules. Offspring are generated by applying genetic operators such as crossover and mutation. The process of offspring generation continues until it evolves a population N where every rule in N satisfies the fitness threshold. With an initial population of 20 instances, generation continued until the 20th generation, with a crossover probability of 0.6 and a mutation probability of 0.033.

The features selected based on the genetic algorithm are "Attendance date," "Hospital code," "GDRG code," "Service bill," and "Drug bill." These features are selected, extracted, and used as the basis for the optimization problem formulated below:

  Minimize Totalcost = f(S_bill, D_bill),
  subject to Σ_{i=1}^n S_{i,bill} ≤ G_tariff, ∀i, i = 1, 2, …, n,
            Σ_{j=1}^n D_{j,bill} ≤ D_tariff, ∀j, j = 1, 2, …, n,  (24)

where S_bill is the service bill and D_bill is the drug bill.

The GA-filtered e-HIC dataset is subjected to SVM training, using 70% of the dataset for training and 30% for testing, as depicted in Figure 7.

The e-HIC dataset which passes the preprocessing stage, that is, the valid claims, was used for SVM training and testing. The best data, those that meet the genetic algorithm's criteria, are classified first. Each record of this dataset is classified as either "Fraudulent Bills" or "Legal Bills."

The same SVM training and testing datasets are applied to the SVM algorithm for its performance analysis. The inbuilt MATLAB code for SVM classifiers was integrated as one function for the linear, polynomial, and RBF kernels. The claim datasets were partitioned for classifier training, testing, and validation: 70% of the dataset was used for training and 30% for testing. The linear, polynomial, and radial basis function SVM classification kernels were used with ten-fold cross validation for each kernel, and the results were averaged. For the polynomial classification kernel, a cubic polynomial was used. The RBF classification kernel used the SMO method [40].
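The GA search described above (population of 20, 20 generations, crossover probability 0.6, mutation probability 0.033, roulette-wheel selection as in Figure 5) can be sketched as a toy Python loop over binary feature masks. This is a sketch under stated assumptions, not the paper's implementation: in the paper the fitness of a candidate subset is the cross-validated SVM performance, whereas the placeholder fitness here merely rewards keeping the bill features so the example is self-contained.

```python
import random

# Toy GA feature-subset search: population 20, 20 generations,
# crossover probability 0.6, mutation probability 0.033 (values from the text).
FEATURES = ["Attendance date", "Hospital code", "GDRG code", "Service bill", "Drug bill"]

def fitness(mask):
    # Placeholder objective (assumption): reward subsets keeping the bill
    # features; the real fitness is the cross-validated SVM performance.
    return 0.01 + sum(m for m, f in zip(mask, FEATURES) if "bill" in f) + 0.1 * sum(mask)

def crossover(a, b):
    point = random.randrange(1, len(a))      # one-point crossover
    return a[:point] + b[point:]

def mutate(mask, p=0.033):
    # Flip each gene with the mutation probability to preserve diversity.
    return [1 - m if random.random() < p else m for m in mask]

def select(population):
    # Roulette-wheel selection: probability proportional to fitness.
    weights = [fitness(m) for m in population]
    return random.choices(population, weights=weights, k=1)[0]

random.seed(0)
population = [[random.randint(0, 1) for _ in FEATURES] for _ in range(20)]
for _ in range(20):                          # 20 generations
    nxt = []
    for _ in range(20):
        parent1, parent2 = select(population), select(population)
        child = crossover(parent1, parent2) if random.random() < 0.6 else parent1[:]
        nxt.append(mutate(child))
    population = nxt

best = max(population, key=fitness)
chosen = [f for f, m in zip(FEATURES, best) if m]
```

The surviving mask `best` plays the role of the optimized feature subset handed to the SVM training stage.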
This method ensures the handling of large data sizes as it performs the data transformation through kernelization. After running many instances and varying the parameters for the RBF kernel, a variance of 0.9 gave better results, as it corresponded well with the datasets used for the classification. After each classification, the correct rate is calculated and the confusion matrix extracted. The confusion matrix gives a count of the true legal, true fraudulent, false legal, false fraudulent, and inconclusive bills.

(i) True legal bills: the number of "Legal Bills" which were correctly classified as "Legal Bills" by the classifier.
(ii) True fraudulent bills: the number of "Fraudulent Bills" which were correctly classified as "Fraudulent Bills" by the classifier.
(iii) False legal bills: bills classified as "Legal Bills" even though they are not; that is, bills wrongly classified as "Legal Bills" by the kernel used.
(iv) False fraudulent bills: bills wrongly classified as fraudulent by the classifier. The confusion matrix gives a count of these incorrectly classified bills.

Figure 5: Flow chart for design and development of the genetic support vector machines. (The flow chart covers data validation, feature subset selection by roulette wheel, SVM training and testing with the selected feature subset, fitness evaluation with mutation, crossover, and recombination, and storage of the optimized SVM hyperparameters (C, γ) in the fraud detection model.)

Figure 6: MATLAB-based decision support engine connection to the database.

Figure 7: Data preprocessing for SVM training and testing. (Claims records are filtered and selected from the e-HIC claims database, followed by GA feature selection and extraction, feature adjustment, and data normalization to produce the SVM training and testing dataset.)
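The five outcome categories above can be tallied directly from (true label, predicted label) pairs. A minimal sketch, assuming each record carries a true label and the classifier's prediction ("legal", "fraud", or None for an inconclusive bill); the label names are illustrative:

```python
# Tally the five confusion-matrix categories described in the text.
def tally(records):
    counts = {"TLB": 0, "TFB": 0, "FLB": 0, "FFB": 0, "inconclusive": 0}
    for true_label, predicted in records:
        if predicted is None:                 # nonclassified bill
            counts["inconclusive"] += 1
        elif predicted == "legal":            # predicted legal: true or false legal
            counts["TLB" if true_label == "legal" else "FLB"] += 1
        else:                                 # predicted fraud: true or false fraudulent
            counts["TFB" if true_label == "fraud" else "FFB"] += 1
    return counts

# Hypothetical example: one record per category.
records = [("legal", "legal"), ("fraud", "fraud"), ("fraud", "legal"),
           ("legal", "fraud"), ("legal", None)]
counts = tally(records)
# counts -> {'TLB': 1, 'TFB': 1, 'FLB': 1, 'FFB': 1, 'inconclusive': 1}
```

These counts are exactly the inputs needed for the correct rate and the other performance metrics defined next.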
(v) Inconclusive bills: these consist of the nonclassified bills.

The correct rate is calculated as the total number of correctly classified bills, namely, the true legal bills and true fraudulent bills, divided by the total number of bills used for the classification:

correct rate = (number of TLB + number of TFB) / total number of bills (TB),  (25)

where TLB denotes the true legal bills and TFB the true fraudulent bills. The accuracy is the probability of a correct classification:

accuracy = 1 − error = (TP + TN) / (TP + TN + FP + FN) = Pr(C).  (26)

4.1.1. Sensitivity. This is the statistical measure of the proportion of actual fraudulent claims which are correctly detected:

sensitivity = TP / (TP + FN) = TP / PP.  (27)

4.1.2. Specificity. This is the statistical measure of the proportion of negative (nonfraudulent) claims which are correctly classified:

specificity = TN / (TN + FP) = TN / NP.  (28)

4.2. GSVM Fraud Detection System Implementation and Testing. The decision support system comprises four main modules integrated together, namely, (1) algorithm implementation using the MATLAB technical computing platform, (2) development of a graphical user interface (GUI) for the HIC fraud detection system, covering the uploading and processing of claims, (3) system administrator management, and (4) postprocessing of the detection and classification results.

Figure 8: System implementation architecture for HICFDS.
Figure 9: Detection results control portal interface.

The front end of the detection system was developed using XAMPP, a free and open-source cross-platform web server solution stack package developed by Apache Friends [42], consisting mainly of the Apache HTTP Server, the MariaDB database, and interpreters for scripts written in the PHP and Perl programming languages [42].
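Equations (25)–(28) can be checked directly against the published confusion-matrix counts. The short Python sketch below is not part of the original system; it reproduces the Table 3 figures for the linear kernel on the 500-claim sample from the corresponding Table 5 counts (TP = 88, TN = 24, FP = 8, FN = 2).

```python
def metrics(tp, tn, fp, fn):
    """Evaluation metrics of Eqs. (26)-(28) from confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,   # Eq. (26): probability of a correct classification
        "sensitivity": tp / (tp + fn),   # Eq. (27): fraudulent claims correctly detected
        "specificity": tn / (tn + fp),   # Eq. (28): legitimate claims correctly classified
    }

# Linear kernel, 500-claim sample (Table 5): TP=88, TN=24, FP=8, FN=2
m = metrics(88, 24, 8, 2)
print(round(m["accuracy"] * 100, 2))     # 91.8  (matches Table 3)
print(round(m["sensitivity"] * 100, 2))  # 97.78
print(round(m["specificity"] * 100, 2))  # 75.0
```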
XAMPP stands for Cross-Platform (X), Apache (A), MariaDB (M), PHP (P), and Perl (P). The Health Insurance Claims Fraud Detection System (HICFDS) was developed in the MATLAB technical computing environment, with the capability to connect to an external database and a graphical user interface (GUI) for enhanced interactivity with users.

The HICFDS consists of several functional components, namely, (1) a function for computing the descriptive statistics of raw and processed data, (2) a preprocessing wrapper function for data handling and processing, and (3) MATLAB functions for the GA optimization and SVM classification processes. The HICFDS components are depicted in Figure 8.

The results generated by the HICFDS are stored in a MySQL database. The results comprise three parts: the legitimate claims report, the fraudulent claims, and statistics of the results. These results are shown in Figure 9, which also displays the developed GUI portal for analyzing the classification results of the submitted health insurance claims. Clicking the fraudulent button in the GUI opens a pop-up menu generating Figure 10 for the claims dataset, which shows the grouping of detected fraudulent claim types in the datasets.

Table 2: Sample data size and the corresponding fraud types.

Fraud types              | 100 | 300 | 500 | 750 | 1000
Duplicate claims         |   2 |   4 |   4 |   4 |    0
Uncovered service claims |   4 |  56 |  65 | 109 |  406
Overbilling claims       |  44 |  60 |  91 | 121 |  202
Unbundled claims         |   0 |  18 |  10 |  54 |    0
Upcoded claims           |   2 |  34 |  50 | 122 |    6
Impersonation claims     |   0 |   2 |  10 |  23 |   34
Total suspected claims   |  52 | 174 | 230 | 433 |  648

Figure 10: Fraud type distribution on the sample data sizes.

Table 3: Summary performance metrics of SVM classifiers on sample sizes.

Kernel used | Data size | Average accuracy rate (%) | Sensitivity (%) | Specificity (%)
Linear      |       100 |                     71.43 |           60.00 |           77.78
            |       300 |                     72.73 |           84.21 |            0.00
            |       500 |                     91.80 |           97.78 |           75.00
            |       750 |                     84.42 |           95.00 |           47.06
            |      1000 |                     82.95 |           85.42 |           80.00
Polynomial  |       100 |                     71.43 |           66.67 |           72.73
            |       300 |                     72.73 |           88.24 |           20.00
            |       500 |                     96.72 |          100.00 |           86.67
            |       750 |                     80.52 |           96.36 |           40.91
            |      1000 |                     84.71 |           83.67 |           86.11
Radial basis function | 100 |                 71.43 |           57.14 |           85.71
            |       300 |                     95.45 |           95.00 |          100.00
            |       500 |                     99.18 |          100.00 |           96.30
            |       750 |                     82.56 |           96.88 |           40.91
            |      1000 |                     90.91 |          100.00 |           82.98

For each classifier, a 10-fold cross validation (CV) of the hyperparameters (C, γ) from the Patients Payment Data (PPD) was performed. The GA optimization tested several hyperparameter settings in search of the optimal SVM. The SVC training aims for the best SVC parameters (C, γ) in building the HICFDS classifier model. The developed classifier is evaluated using testing and validation data, and its accuracy is estimated using cross validation (CV) to avoid overfitting the SVC to the training data.
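The 10-fold cross validation described above can be sketched generically: shuffle the indices once, split them into ten folds, hold each fold out in turn, and average the resulting accuracies. This is a plain-Python illustration, not the paper's MATLAB code; the `train_and_score` callback is a placeholder for training the SVC on the training indices and scoring it on the held-out fold.

```python
import random

def kfold_cv_accuracy(samples, labels, train_and_score, k=10, seed=0):
    """Average accuracy over k folds: each fold is held out once for testing
    while the remaining folds are used for training (10-fold CV sketch)."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]   # k disjoint folds covering all indices
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        scores.append(train_and_score(train, sorted(held_out)))
    return sum(scores) / k
```

Averaging over folds gives a less optimistic accuracy estimate than scoring on the training data itself, which is why the paper uses it to guard against overfitting.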
The random search method was used for SVC parameter training, where exponentially growing sequences of the hyperparameters (C, γ) were used as a practical way to identify suitable SVC parameters and obtain the best CV accuracy on the claims data samples. Random search varies slightly from grid search: instead of searching over the entire grid, it evaluates only a random sample of points on the grid, which makes it computationally cheaper than a grid search. Experimentally, 10-fold CV was used as the measure of the training accuracy, where 70% of each sample was used for training and the remaining 30% for testing and validation.

Figure 11: Linear SVM on a sample claims dataset.
Figure 12: Polynomial SVM on a sample claims dataset.

Table 5: Confusion matrix for SVM classifiers.

Kernel                | Data size | TP | TN | FP | FN | Correct rate (%)
Linear                |       100 |  3 |  7 |  2 |  2 | 71.4
                      |       300 | 16 |  0 |  3 |  3 | 71.3
                      |       500 | 88 | 24 |  8 |  2 | 91.8
                      |       750 | 57 |  8 |  9 |  3 | 84.4
                      |      1000 | 41 | 32 |  8 |  7 | 83.0
Polynomial            |       100 |  2 |  8 |  3 |  1 | 71.4
                      |       300 | 15 |  1 |  4 |  2 | 72.3
                      |       500 | 92 | 26 |  4 |  0 | 96.7
                      |       750 | 53 | 91 | 13 |  2 | 80.5
                      |      1000 | 41 | 31 |  5 |  8 | 85.2
Radial basis function |       100 |  4 |  6 |  1 |  3 | 71.4
                      |       300 | 19 |  2 |  0 |  1 | 95.5
                      |       500 | 95 | 26 |  1 |  0 | 99.2
                      |       750 | 62 |  9 | 13 |  2 | 92.2
                      |      1000 | 41 | 39 |  8 |  0 | 91.9
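The random search over exponentially growing (C, γ) sequences can be sketched as follows. This is an assumption-laden illustration: the grids (2⁻⁵ … 2¹⁵ for C and 2⁻¹⁵ … 2³ for γ) are the ranges commonly suggested in practical SVM guides, not values stated by the paper, and the `evaluate` callback stands in for the 10-fold CV accuracy of an SVC trained with the sampled pair.

```python
import random

def random_search(n_trials, evaluate, seed=42):
    """Sample (C, gamma) pairs from exponentially growing grids instead of
    scanning the full grid; keep the pair with the best evaluation score."""
    rng = random.Random(seed)
    c_grid = [2.0 ** e for e in range(-5, 16)]    # C candidates: 2^-5 .. 2^15
    g_grid = [2.0 ** e for e in range(-15, 4)]    # gamma candidates: 2^-15 .. 2^3
    best = None
    for _ in range(n_trials):
        c, g = rng.choice(c_grid), rng.choice(g_grid)
        score = evaluate(c, g)
        if best is None or score > best[0]:
            best = (score, c, g)
    return best  # (best score, best C, best gamma)
```

Evaluating, say, 30 random points instead of all 21 × 19 grid points is what makes random search the computationally cheaper option the text describes.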
Figure 13: RBF SVM on a sample claims dataset.

Table 4: Average performance analysis of SVM classifiers.

Kernel     | Accuracy (%) | Sensitivity (%) | Specificity (%)
Linear     |        80.67 |           84.48 |           55.97
Polynomial |        81.22 |           86.99 |           61.28
RBF        |        87.91 |           89.80 |           81.18

4.3. Data Postprocessing: Validation of Classification Results. The classification accuracy on the testing data is a gauge of the ability of the HICFDS to detect and identify fraudulent claims. The testing data used to assess and evaluate the efficiency of the proposed HICFDS classifier were taken exclusively from the NHIS headquarters and cover different hospitals within the Greater Accra Region of Ghana. The sampled data with the corresponding fraud types found by the analysis are shown in Table 2.

In evaluating the classifiers obtained with the analyzed methods, the most widely employed performance measures are used: accuracy, sensitivity, and specificity, built on the counts of true legal (TP), false fraudulent (FN), false legal (FP), and true fraudulent (TN) bills. This classification is shown in Table 3. The SVC plots for the various classifiers (linear, polynomial, and RBF) on the claims datasets are shown in Figures 11–13.

From the performance metrics and overall statistics presented in Table 4, it is observed that the support vector machine performs best with an accuracy of 87.91% using the RBF kernel function, followed by the polynomial kernel with 81.22% accuracy, with the linear SVM emerging as the weakest classifier at 80.67% accuracy. The confusion matrix for the SVM classifiers, which is utilized in the computation of their performance metrics, is given in Table 5.

Table 6: Cost analysis of tested claims dataset.

Sample size | Raw cost of claims (R), GHC | Valid claims cost (V) | Deviation (R − V) | Percentage difference
100         |  20791.83 |  8911.72 | 11880.11 | 133.31
300         |  31496.05 | 15622.70 | 15873.35 | 101.60
500         |  58218.65 | 27480.96 | 30737.69 | 111.85
750         |  88394.07 | 31091.58 | 57302.49 | 184.30
1000        | 117448.20 | 47943.38 | 69504.82 | 144.97
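The percentage-difference column of Table 6 is reproducible as the deviation (R − V) expressed relative to the valid claims cost V. Note that this formula is inferred from the published figures rather than stated explicitly in the text.

```python
def percent_deviation(raw_cost, valid_cost):
    """Percentage-difference column of Table 6: deviation (R - V) relative
    to the valid-claims cost V (formula inferred from the published values)."""
    return round((raw_cost - valid_cost) / valid_cost * 100, 2)

print(percent_deviation(20791.83, 8911.72))   # 133.31 (100-claim sample)
print(percent_deviation(88394.07, 31091.58))  # 184.3  (750-claim sample)
```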
Figure 14: Computational time on the tested sample dataset.

For statistical and machine learning classification tasks, a confusion matrix, also known as an error matrix, is a specific table layout that allows visualization of the performance of a supervised learning algorithm. Besides classification accuracy, the amount of time required to process the sample dataset is also an important consideration in this research. The comparison of computational times shows that an increase in the size of the sample dataset also increases the computational time needed to execute the process, regardless of the machine used, which is widely expected. This difference in time cost is due mainly to training on the dataset. Thus, as global data warehouses grow, more computational resources will be needed in machine learning and data mining research pertaining to the detection of insurance fraud, as depicted in Figure 14, which relates the average computational time to the sample data size.

Figure 15: Detected fraud trend on the tested claims dataset.

Figure 15 summarizes the fraudulent claims detected during the testing of the HICFDS with the sample datasets used. As the sample data size increases, the number of suspected claims increases rapidly across the various fraud types detected. Benchmarking the HICFDS analysis ensures understanding of the HIC outcomes: an increase in the size of the claims dataset has a corresponding increase in the number of suspected claims. Figure 16 shows a sudden rise in the level of suspected claims on the tested 100-claim dataset, representing 52% of the sample, after which the proportion of suspected claims increases slightly by 2%, to 58% of the tested 300-claim dataset.
Figure 16: Chart of types of fraudulent claims.

Among these fraud types, the most frequent fraudulent act is uncovered services rendered to insurance subscribers by service providers. It accounts for 22% of the fraudulent claims, the most significant proportion of the total health insurance fraud in the tested dataset. Overbilling of submitted claims is the second most common fraud type, representing 20% of the total sample dataset used for this research. It is caused by service providers billing a service at more than the expected tariff for the required diagnoses; listing and billing a more complex or higher level of service is done by providers to unfairly boost their financial income from legitimate claims.

Moreover, some illicit service providers claim to have rendered costly services to insurance subscribers instead of the more affordable ones actually provided. Claims prepared on expensive services rendered to insurance subscribers represent 8% of the fraudulent claims detected in the total sample dataset. Furthermore, unbundled claims, in which service procedures that should be considered an integral part of a single procedure are billed separately, contributed 3.1% of the fraudulent claims in the test data. Owing to insecure processes in the delivery of healthcare services, insurance subscribers also contribute to fraud by loaning their ID cards to family members or third parties, who pretend to be the owners and request NHIS benefits in the healthcare sector. Duplicated claims recorded the minimum rate, contributing 0.5% of the fraudulent claims in the whole sample dataset.

As observed in Table 6, the cost of the claims bill increases proportionally with an increase in the sample size of the claims bill. This is consistent with the increase in fraudulent claims as the sample size increases. Table 6 lists, for each sample, the raw cost of the claims (R), the valid claims cost after processing (V), the deviation in the claims bill (R − V), and its percentage representation. There is a 27% financial loss of the total submitted claim bills to insurance carriers; this loss is highest within the 750-claim dataset of submitted claims.

A summary of the results and a comparison with other machine learning algorithms, namely decision trees and Naïve–Bayes, are presented in Table 7.

Table 7: Comparison of results of GSVM with decision trees and Naïve–Bayes.

Algorithm                                   | Claims dataset | Accuracy (%) | Average over datasets (%)
GSVM with radial basis function (RBF) kernel | 100           | 71.43        |
                                            | 300            | 95.45        |
                                            | 500            | 99.18        | 87.906
                                            | 750            | 82.56        |
                                            | 1000           | 90.91        |
Decision trees                              | 100            | 62.00        |
                                            | 300            | 78.00        |
                                            | 500            | 77.80        | 74.44
                                            | 750            | 82.70        |
                                            | 1000           | 71.70        |
Naïve–Bayes                                 | 100            | 50.00        |
                                            | 300            | 61.00        |
                                            | 500            | 56.80        | 59.10
                                            | 750            | 60.70        |
                                            | 1000           | 67.00        |

The MATLAB Classification Learner App [43] was chosen to validate the results obtained above, as it enables easy comparison across the different classification algorithms implemented. The data used for the GSVM were subsequently loaded into the Classification Learner App.

Figure 17: Classification Learner App showing the various algorithms and percentage accuracies in MATLAB.
Figure 18: Algorithmic runs on the 500-claim dataset.
Figure 19: Algorithmic runs on the 750-claim dataset.
Figures 17 and 18 show the Classification Learner App with the various implemented algorithms and corresponding accuracies in the MATLAB technical computing environment and the results obtained using the 500-claim dataset, respectively. Figures 19 and 20 depict the subsequent results when the 750- and 1000-claim datasets were utilized for the algorithmic runs and reproducible comparison, respectively. The summarized results and accuracies are given in Table 7; they portray the effectiveness of our proposed approach of using genetic support vector machines (GSVMs) for fraud detection in insurance claims. From the results, it is evident that the GSVM achieves a higher level of accuracy than decision trees and Naïve–Bayes.

Figure 20: Algorithmic runs on the 1000-claim dataset.

5. Conclusions and Recommendations

This work aimed at developing a novel fraud detection model for insurance claims processing based on genetic support vector machines, which hybridize and draw on the strengths of both genetic algorithms and support vector machines. The GSVM has been investigated and applied in the development of the HICFDS. This paper used the GSVM for the detection of anomalies and the classification of health insurance claims into legitimate and fraudulent claims. SVMs have been considered preferable to other classification techniques due to several advantages. They enable separation (classification) of claims into legitimate and fraudulent using the soft margin, thus accommodating updates in the generalization performance of the HICFDS. Among other notable advantages, the SVM has a nonlinear dividing hyperplane, which prevails over the discrimination within the dataset. The generalization ability on newly arrived data for classification was also considered over other classification techniques.

Thus, the fraud detection system combines two computational intelligence schemes and achieves higher fraud detection accuracy. The average classification accuracies achieved by the SVCs are 80.67%, 81.22%, and 87.91%, which show the performance capability of the SVC model. These classification accuracies are obtained owing to the careful selection of the features for training and developing the model, as well as the fine-tuning of the SVC parameters using the V-fold cross-validation approach. These results are much better than those obtained using decision trees and Naïve–Bayes. The average sample dataset testing results for the proposed SVCs vary due to the nature of the claims dataset used. This is noted in the clustering of the claims dataset by MDC specialty: when the sample dataset is heavily skewed toward one MDC specialty (e.g., OPDC), the performance of the SVCs can tune to one classifier, especially the linear SVM, as compared with the others. Hence, the behaviour of the dataset has a significant impact on the classification results.

Based on this work, the developed GSVM model was tested and validated using HIC data. The study sought to obtain the best performing classifier for analyzing the health insurance claims datasets for fraud. The RBF kernel was adjudged the best, with an average accuracy rate of 87.91%, and is therefore recommended.

Data Availability

The data used in this study are available upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments
The authors of this paper wish to acknowledge the Carnegie Corporation of New York, through the University of Ghana, under the UG-Carnegie Next Generation of Academics in Africa project, for organizing Write Shops that led to the timely completion of this paper.

Supplementary Materials

The supplementary material consists of an MS Excel file of data collected from some NHIS-approved hospitals in Ghana concerning insurance claims. This insurance claims dataset was used for testing and implementation. (Supplementary Materials)

References

[1] Government of Ghana, National Health Insurance Act, Act 650, Ghana, 2003.
[2] Capitation, National Health Insurance Scheme, 2012, http://www.nhis.gov.gh/capitation.aspx.
[3] ICD-10 Version: 2016, http://apps.who.int/classifications/icd10/browse/2016/en.
[4] T. Olson, Examining the Transitional Impact of ICD-10 on Healthcare Fraud Detection, College of Saint Benedict/Saint John's University, Collegeville, MN, USA, 2015.
[5] News Ghana, NHIS Manager Arrested for Fraud, News Ghana, Accra, Ghana, 2014, https://www.newsghana.com.gh/nhis-manager-arrested-for-fraud/.
[6] BioClaim Files, http://www.bioclaim.com/Fraud-Files/.
[7] Graphic Online, Ghana News: Dr. Ametewee Defrauds NHIA of GH¢415,000, Graphic Online, Accra, Ghana, 2015, http://www.graphic.com.gh/news/general-news/dr-ametewee-defrauds-nhia-of-gh-415-000.html.
[8] W.-S. Yang and S.-Y. Hwang, "A process-mining framework for the detection of healthcare fraud and abuse," Expert Systems with Applications, vol. 31, no. 1, pp. 56–68, 2006.
[9] G. C. van Capelleveen, Outlier Based Predictors for Health Insurance Fraud Detection within U.S. Medicaid, University of Twente, Enschede, Netherlands, 2013.
[10] Y. Shan, D. W. Murray, and A. Sutinen, "Discovering inappropriate billings with local density-based outlier detection method," in Proceedings of the Eighth Australasian Data Mining Conference, vol. 101, pp. 93–98, Melbourne, Australia, December 2009.
[11] L. D. Weiss and M. K. Sparrow, "License to steal: how fraud bleeds America's health care system," Journal of Public Health Policy, vol. 22, no. 3, pp. 361–363, 2001.
[12] P. Travaille, R. M. Müller, D. Thornton, and J. Van Hillegersberg, "Electronic fraud detection in the U.S. Medicaid healthcare program: lessons learned from other industries," in Proceedings of the 17th Americas Conference on Information Systems (AMCIS), pp. 1–11, Detroit, MI, USA, August 2011.
[13] A. Abdallah, M. A. Maarof, and A. Zainal, "Fraud detection system: a survey," Journal of Network and Computer Applications, vol. 68, pp. 90–113, 2016.
[14] A. K. I. Hassan and A. Abraham, "Computational intelligence models for insurance fraud detection: a review of a decade of research," Journal of Network and Innovative Computing, vol. 1, pp. 341–347, 2013.
[15] E. Kirkos, C. Spathis, and Y. Manolopoulos, "Data mining techniques for the detection of fraudulent financial statements," Expert Systems with Applications, vol. 32, no. 4, pp. 995–1003, 2007.
[16] H. Joudaki, A. Rashidian, B. Minaei-Bidgoli et al., "Using data mining to detect health care fraud and abuse: a review of literature," Global Journal of Health Science, vol. 7, no. 1, pp. 194–202, 2015.
[17] V. Rawte and G. Anuradha, "Fraud detection in health insurance using data mining techniques," in Proceedings of the 2015 International Conference on Communication, Information & Computing Technology (ICCICT), pp. 1–5, Mumbai, India, January 2015.
[18] J. Li, K.-Y. Huang, J. Jin, and J. Shi, "A survey on statistical methods for health care fraud detection," Health Care Management Science, vol. 11, no. 3, pp. 275–287, 2008.
[19] Q. Liu and M. Vasarhelyi, "Healthcare fraud detection: a survey and a clustering model incorporating geo-location information," in Proceedings of the 29th World Continuous Auditing and Reporting Symposium, Brisbane, Australia, November 2013.
[20] T. Ekin, F. Leva, F. Ruggeri, and R. Soyer, "Application of Bayesian methods in detection of healthcare fraud," Chemical Engineering Transactions, vol. 33, pp. 151–156, 2013.
[21] Home—The NHCAA, https://www.nhcaa.org/.
[22] S. Viaene, R. A. Derrig, and G. Dedene, "A case study of applying boosting naive Bayes to claim fraud diagnosis," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 5, pp. 612–620, 2004.
[23] Y. Singh and A. S. Chauhan, "Neural networks in data mining," Journal of Theoretical and Applied Information Technology, vol. 5, no. 1, pp. 37–42, 2009.
[24] D. Tomar and S. Agarwal, "A survey on data mining approaches for healthcare," International Journal of Bio-Science and Bio-Technology, vol. 5, no. 5, pp. 241–266, 2013.
[25] P. Vamplew, A. Stranieri, K.-L. Ong, P. Christen, and P. J. Kennedy, "Data mining and analytics 2011," in Proceedings of the Ninth Australasian Data Mining Conference (AusDM'11), Australian Computer Society, Ballarat, Australia, December 2011.
[26] K. S. Ng, Y. Shan, D. W. Murray et al., "Detecting non-compliant consumers in spatio-temporal health data: a case study from Medicare Australia," in Proceedings of the 2010 IEEE International Conference on Data Mining Workshops, pp. 613–622, Sydney, Australia, December 2010.
[27] J. F. Roddick, J. Li, P. Christen, and P. J. Kennedy, "Data mining and analytics 2008," in Proceedings of the 7th Australasian Data Mining Conference (AusDM 2008), vol. 87, pp. 105–110, Glenelg, South Australia, November 2008.
[28] C. Watrin, R. Struffert, and R. Ullmann, "Benford's Law: an instrument for selecting tax audit targets?," Review of Managerial Science, vol. 2, no. 3, pp. 219–237, 2008.
[29] F. Lu and J. E. Boritz, "Detecting fraud in health insurance data: learning to model incomplete Benford's Law distributions," in Machine Learning, J. Gama, R. Camacho, P. B. Brazdil, A. M. Jorge, and L. Torgo, Eds., pp. 633–640, Springer, Berlin, Heidelberg, 2005.
[30] P. Ortega, C. J. Figueroa, and G. A. Ruz, "A medical claim fraud/abuse detection system based on data mining: a case study in Chile," DMIN, vol. 6, pp. 26–29, 2006.
[31] T. Bäck, J. M. De Graaf, J. N. Kok, and W. A. Kosters, Theory of Genetic Algorithms, World Scientific Publishing, River Edge, NJ, USA, 2001.
[32] M. Melanie, An Introduction to Genetic Algorithms, The MIT Press, Cambridge, MA, USA, 1st edition, 1998.
[33] D. Goldberg, Genetic Algorithms in Optimization, Search and Machine Learning, Addison-Wesley, Reading, MA, USA, 1989.
[34] J. Wroblewski, "Theoretical foundations of order-based genetic algorithms," Fundamenta Informaticae, vol. 28, no. 3-4, pp. 423–430, 1996.
[35] J. H. Holland, Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence, MIT Press, Cambridge, MA, USA, 1st edition, 1992.
[36] V. N. Vapnik, The Nature of Statistical Learning Theory, Springer, New York, NY, USA, 2nd edition, 2000.
[37] J. Salomon, Support Vector Machines for Phoneme Classification, University of Edinburgh, Edinburgh, UK, 2001.
[38] J. Platt, Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines, Microsoft Research, Redmond, WA, USA, 1998.
[39] J. Platt, "Using analytic QP and sparseness to speed training of support vector machines," in Proceedings of the Advances in Neural Information Processing Systems, Cambridge, MA, USA, 1999.
[40] C.-W. Hsu, C.-C. Chang, and C.-J. Lin, A Practical Guide to Support Vector Classification, National Taiwan University, Taipei, Taiwan, 2003.
[41] I. Guyon and A. Elisseeff, "An introduction to variable and feature selection," Journal of Machine Learning Research, vol. 3, pp. 1157–1182, 2003.
[42] D. Dvorski, Installing, Configuring, and Developing with XAMPP, Ski Canada Magazine, Toronto, Canada, 2007.
[43] MATLAB Classification Learner App: MATLAB Version 2019a, MathWorks, Natick, MA, USA, 2019, http://www.mathworks.com/help/stats/classification-learner-app.html.