An automatic software vulnerability classification framework using term frequency-inverse gravity moment and feature selection

dc.contributor.authorMensah, S.
dc.contributor.authorChen, J.
dc.contributor.authorKudjo, P.K.
dc.contributor.authorBrown, S.A.
dc.contributor.authorAkorfu, G.
dc.date.accessioned2020-07-02T13:51:04Z
dc.date.available2020-07-02T13:51:04Z
dc.date.issued2020-05-15
dc.descriptionResearch Articleen_US
dc.description.abstractVulnerability classification is an important activity in software development and software quality maintenance. A typical vulnerability classification model usually involves a stage of term selection, in which the relevant terms are identified via feature selection. It also involves a stage of term-weighting, in which the document weights for the selected terms are computed, and a stage for classifier learning. Generally, the term frequency-inverse document frequency (TF-IDF) model is the most widely used term-weighting metric for vulnerability classification. However, several issues hinder the effectiveness of the TF-IDF model for document classification. To address this problem, we propose and evaluate a general framework for vulnerability severity classification using the term frequency-inverse gravity moment (TF-IGM). Specifically, we extensively compare the term frequency-inverse gravity moment, term frequency-inverse document frequency, and information gain feature selection using five machine learning algorithms on ten vulnerable software applications containing a total number of 27,248 security vulnerabilities. The experimental result shows that: (i) the TF-IGM model is a promising term weighting metric for vulnerability classification compared to the classical term-weighting metric, (ii) the effectiveness of feature selection on vulnerability classification varies significantly across the studied datasets and (iii) feature selection improves vulnerability classification.en_US
dc.description.sponsorshipNational Natural Science Foundation of China (NSFC U1836116, 6170022430 and 61872167), the Project of Jiangsu Provincial Six Talent Peaks (Grant number: XXJS-016), the Postdoctoral Science Foundation of China 1112019T120399 and the Graduate Research Innovation Project of Jiangsu Province (Grant numbers KYCX17 1807).en_US
dc.identifier.urihttp://ugspace.ug.edu.gh/handle/123456789/35446
dc.language.isoenen_US
dc.publisherJournal of Systems and Softwareen_US
dc.relation.ispartofseries167;
dc.subjectSoftware vulnerabilityen_US
dc.subjectClassificationen_US
dc.subjectFeature selectionen_US
dc.subjectMachine learning algorithmsen_US
dc.subjectSeverityen_US
dc.subjectTerm-weightingen_US
dc.titleAn automatic software vulnerability classification framework using term frequency-inverse gravity moment and feature selectionen_US
dc.typeArticleen_US
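
The abstract names TF-IGM as the term-weighting metric but does not give its formula. The Python sketch below illustrates one commonly cited formulation, assumed here rather than taken from this record: for a term with class-specific frequencies sorted in descending order f_1 >= f_2 >= ... >= f_m, igm = f_1 / sum(f_r * r), and the weight is tf * (1 + lambda * igm). The function names `igm` and `tf_igm_weights`, the use of class-specific document frequency for f_r, the value lambda = 7, and the toy documents and labels are all illustrative assumptions, not the authors' implementation or data.

```python
from collections import Counter


def igm(class_freqs):
    """Inverse gravity moment of one term (assumed formulation).

    class_freqs holds the term's frequency in each class. The values are
    ranked in descending order, and the largest one is divided by the
    rank-weighted sum f_1*1 + f_2*2 + ... + f_m*m.
    """
    ranked = sorted(class_freqs, reverse=True)
    moment = sum(f * r for r, f in enumerate(ranked, start=1))
    return ranked[0] / moment if moment else 0.0


def tf_igm_weights(docs, labels, lam=7.0):
    """Return one {term: weight} dict per document.

    docs   : list of token lists
    labels : class label of each document (e.g. a severity level)
    lam    : scaling coefficient lambda (7.0 is an assumed default)
    """
    classes = sorted(set(labels))
    # Class-specific document frequency: in how many documents of each
    # class does the term occur at least once?
    df = {c: Counter() for c in classes}
    for tokens, label in zip(docs, labels):
        df[label].update(set(tokens))
    vocab = {t for c in classes for t in df[c]}
    # Global factor per term: 1 + lambda * igm(term)
    factor = {t: 1.0 + lam * igm([df[c][t] for c in classes]) for t in vocab}
    # Local factor is the raw term frequency within the document.
    return [
        {t: tf * factor[t] for t, tf in Counter(tokens).items()}
        for tokens in docs
    ]


# Toy, made-up vulnerability descriptions with hypothetical severity labels.
docs = [
    ["buffer", "overflow", "remote"],
    ["remote", "xss"],
    ["buffer", "overflow"],
]
labels = ["high", "low", "high"]
weights = tf_igm_weights(docs, labels)
```

In this toy input, "buffer" occurs only in the "high" class and so receives the largest global factor, while "remote" is spread evenly over both classes and is weighted lower. That is the class-distinguishing behaviour this kind of weighting is meant to capture.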

Files

License bundle

Name:
license.txt
Size:
1.6 KB
Format:
Item-specific license agreed upon to submission