Multiple Bug Detection And Effort Estimation Framework For Open-Source Projects

Date

2021-09

Publisher

University of Ghana

Abstract

Bug reports are essential to software development and maintenance. Bug tracking systems let testers submit bug reports, which are then analyzed and assigned to fixers. A bug is described as a multiple bug when it is reported by more than two reporters, and as a duplicate bug when it is reported by two reporters. Given a pool of bug reports from a tracking system, estimating the effort required to identify multiple bugs is a challenge, which motivates this study. An effort estimation framework that detects multiple bugs would reduce the effort software testers spend analyzing bug reports and improve software reliability and productivity. Although several studies have attempted to solve the problem, an effort estimation framework for detecting multiple bugs in software projects, specifically open-source projects, is still needed. Two constraints apply when detecting multiple bugs in open-source projects: (1) the large number of existing bug reports, and (2) the considerable effort required to detect and analyze multiple bug reports.

This study develops a framework to detect multiple bugs and estimate the effort required to identify them in open-source projects. It implements the bugdetector tool, which uses bug information and code features to find similar bugs. The tool first extracts features from bug information in a bug tracking system, then locates bug methods in the source code and extracts their code features. It calculates similarities between each overridden and overloaded method and, finally, determines which methods may cause potentially related or similar bugs. Empirical analysis was conducted on bug reports extracted by the bugdetector tool from two open-source projects, Mozilla Firefox and Eclipse. The analysis used deep learning algorithms (LSTM, Bidirectional LSTM, and CNN) and conventional machine learning algorithms (SVM and Random Forest). Accuracy, precision, recall, and F1-score were used to evaluate model performance. The effort required to identify multiple bugs was estimated using a proposed effort estimation metric.

The empirical results show that Bidirectional LSTM yielded the best performance for multiple bug detection across the two datasets. For Mozilla Firefox, Bidirectional LSTM achieved the best accuracy (71.09%), precision (68.30%), and recall (45.7%). For Eclipse, it achieved the best accuracy (82.6%) and F1-score (50.9%). The average effort required to detect multiple bugs ranges from 1255.7 to 1383.2 days for the studied Eclipse bug repository and from 1049.8 to 1139.2 days for the studied Mozilla Firefox bug repository. The study concludes that deep learning detects multiple bugs in open-source projects better than the conventional machine learning approach. It also introduces an effort estimation metric for computing the effort required to detect multiple bugs in open-source projects, which will help software testers and fixers differentiate between severity levels of the detected bugs based on the computed effort. Keywords: Duplicate bugs, Effort estimation, Bug detection, Deep learning, Open-source projects.
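
The abstract's definitions of duplicate and multiple bugs, and the evaluation metrics it names, can be illustrated with a minimal Python sketch. This is not the bugdetector tool or any of the thesis's models; the function names (`label_bugs`, `binary_metrics`), the "single" label for a bug with one reporter, and the toy data are assumptions made purely for illustration.

```python
def label_bugs(reports):
    """Label each bug by how many distinct reporters filed it.

    `reports` is an iterable of (bug_id, reporter) pairs (hypothetical format).
    Following the abstract: two reporters -> "duplicate",
    more than two -> "multiple". The "single" label is an assumption.
    """
    reporters = {}
    for bug_id, reporter in reports:
        reporters.setdefault(bug_id, set()).add(reporter)

    labels = {}
    for bug_id, who in reporters.items():
        if len(who) > 2:
            labels[bug_id] = "multiple"
        elif len(who) == 2:
            labels[bug_id] = "duplicate"
        else:
            labels[bug_id] = "single"
    return labels


def binary_metrics(y_true, y_pred, positive="multiple"):
    """Accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy data, invented for illustration only.
toy_reports = [
    ("B1", "alice"), ("B1", "bob"), ("B1", "carol"),  # three reporters
    ("B2", "alice"), ("B2", "bob"),                   # two reporters
    ("B3", "dave"),                                   # one reporter
]
toy_labels = label_bugs(toy_reports)

toy_true = ["multiple", "multiple", "single", "duplicate"]
toy_pred = ["multiple", "single", "multiple", "duplicate"]
toy_scores = binary_metrics(toy_true, toy_pred)
```

Treating "multiple" as the positive class and everything else as negative is one possible way to frame the detection task; the abstract does not state how the thesis frames it.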

Description

MPhil. Computer Science

Keywords

Open-Source Projects, Bug Detection, Duplicate bugs, Effort estimation, Deep learning
