Browsing by Author "Abedu, S."
Item: Machine Learning Algorithms on Small-Sized Datasets in Software Effort Estimation: A Comparative Study (University of Ghana, 2021-01). Abedu, S.

Context: Software effort estimation is crucial in the software development process. Overestimating or underestimating the effort for a software project can have serious consequences for a company's bidding and development processes. Over the years, there has been growing research interest in machine learning approaches to software effort estimation. Although deep learning has been described as the state of the art in machine learning, little has been done to assess the performance of deep learning approaches in this field.

Objective: This study defines a discretization scheme for setting a threshold for small-sized datasets in software effort estimation. It also investigates the performance of selected machine learning models on those small-sized datasets.

Method: Software effort estimation datasets, together with their numbers of project instances and features, were identified from existing literature and ranked according to the number of project instances. Eubank’s optimal spacing theory was used to discretize the ranking of the project instances into three classes. The performance of selected conventional machine learning models and two deep learning models was assessed on the datasets classified as small-sized. Leave-one-out cross-validation, as recommended by Kitchenham, was adopted to train and validate the selected models, and the performance of each model on the selected datasets was measured using the mean absolute error (MAE). Robust statistical tests were conducted using Yuen’s t-test and Cliff’s delta effect size.

Results: The conventional machine learning models achieved better prediction performance than the deep learning models. Nonetheless, after early stopping regularisation was applied to the deep learning models, they achieved better prediction accuracy than the conventional machine learning models but failed to outperform the Automatically Transformed Linear Model (ATLM).

Conclusion: The study concluded that conventional machine learning approaches achieve better performance than deep learning approaches on small-sized datasets; however, applying the early stopping regularisation technique can improve the performance of the deep learning models. Also, a given software effort estimation dataset can be classified as small-sized if the number of project instances in the dataset is less than 43.

Keywords: Deep Learning, Machine Learning, Software Effort Estimation, Small-Sized Dataset
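To make the evaluation protocol above concrete, here is a minimal sketch of leave-one-out cross-validation scored with MAE. The synthetic data, feature count, and choice of regressor are hypothetical stand-ins, not the study's actual datasets or models.

```python
# A minimal LOOCV + MAE sketch. Data and model are illustrative
# stand-ins, not the study's actual datasets or chosen learners.
import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
X = rng.random((40, 5))            # 40 projects, 5 features (synthetic)
y = rng.random(40) * 1000          # effort, e.g. person-hours (synthetic)

loo = LeaveOneOut()
errors = []
for train_idx, test_idx in loo.split(X):
    model = LinearRegression()     # any regressor can be dropped in here
    model.fit(X[train_idx], y[train_idx])
    pred = model.predict(X[test_idx])
    errors.append(abs(pred[0] - y[test_idx][0]))

# MAE over all held-out projects
print(f"LOOCV MAE: {np.mean(errors):.2f}")
```

LOOCV is a natural fit here because every project instance is used for validation exactly once, which matters when a dataset has only a few dozen rows.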
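The early stopping regularisation mentioned in the results can be sketched as follows. The network architecture, patience value, and other hyperparameters are illustrative assumptions, not the study's actual configuration.

```python
# A hedged sketch of early-stopping regularisation on a small dataset.
# Architecture and hyperparameters are assumptions for illustration.
import numpy as np
from tensorflow import keras

rng = np.random.default_rng(0)
X = rng.random((40, 5)).astype("float32")     # synthetic small-sized dataset
y = (rng.random(40) * 1000).astype("float32") # synthetic effort values

model = keras.Sequential([
    keras.Input(shape=(5,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1),                    # single output: estimated effort
])
model.compile(optimizer="adam", loss="mae")

# Stop training once validation loss stops improving, and restore the
# weights from the best epoch to curb overfitting on small data.
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=10, restore_best_weights=True
)
model.fit(X, y, validation_split=0.2, epochs=500,
          callbacks=[early_stop], verbose=0)
```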
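The robust statistical comparison might look like the sketch below. SciPy's ttest_ind with a nonzero trim performs Yuen's trimmed-means t-test; Cliff's delta is computed by hand; the two error samples are synthetic placeholders.

```python
# Yuen's t-test (SciPy's trimmed ttest_ind) plus a hand-rolled Cliff's
# delta. The two error samples below are synthetic placeholders.
import numpy as np
from scipy import stats

def cliffs_delta(a, b):
    """Cliff's delta: P(a > b) - P(a < b) over all pairs."""
    a, b = np.asarray(a), np.asarray(b)
    greater = (a[:, None] > b[None, :]).sum()
    less = (a[:, None] < b[None, :]).sum()
    return (greater - less) / (len(a) * len(b))

errors_a = np.random.default_rng(1).normal(120, 30, 40)  # model A's errors
errors_b = np.random.default_rng(2).normal(150, 45, 40)  # model B's errors

# trim=0.2 gives Yuen's t-test on 20%-trimmed means
t, p = stats.ttest_ind(errors_a, errors_b, equal_var=False, trim=0.2)
d = cliffs_delta(errors_a, errors_b)
print(f"Yuen's t = {t:.3f}, p = {p:.4f}, Cliff's delta = {d:.3f}")
```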
Item: An Optimal Spacing Approach for Sampling Small-sized Datasets for Software Effort Estimation (2023). Abedu, S.; Mensah, S.; Boafo, F.

Context: There has been a growing research focus on conventional machine learning techniques for software effort estimation (SEE), but only a limited number of studies have assessed the performance of deep learning approaches in SEE, because SEE datasets are relatively small.

Purpose: This study seeks to define a threshold for small-sized datasets in SEE and investigates the performance of selected conventional machine learning and deep learning models on such datasets.

Method: Plausible SEE datasets, with their numbers of project instances and features, are extracted from existing literature and ranked. Eubank’s optimal spacing theory is used to discretize the ranking of the project instances into three classes (small, medium, and large). Five conventional machine learning models and two deep learning models are trained on each dataset classified as small-sized using leave-one-out cross-validation, and the mean absolute error is used to assess the prediction performance of each model.

Results: Findings from the study contradict existing knowledge by demonstrating that deep learning models provide better prediction performance than conventional machine learning models on small-sized datasets.

Conclusion: Deep learning can be adopted for SEE with the application of regularisation techniques.
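As a rough illustration of the ranking-and-discretization step shared by both studies, the sketch below ranks some well-known SEE datasets by project count and cuts the distribution into three classes. Eubank’s optimal spacing theory prescribes distribution-dependent cut points; the even tertile spacings used here are a simplifying assumption, and the dataset list is illustrative, so the resulting classes need not reproduce the studies' 43-project threshold.

```python
# Rank SEE datasets by project count and cut the ranking into three
# classes. Tertile cut points stand in for Eubank's optimal spacings
# (an assumption); the dataset list is illustrative.
import numpy as np

# (name, number of project instances) for some well-known SEE datasets
datasets = [("Desharnais", 81), ("Albrecht", 24), ("Kemerer", 15),
            ("COCOMO81", 63), ("Maxwell", 62), ("Miyazaki94", 48),
            ("China", 499)]

sizes = np.array([n for _, n in datasets])
cuts = np.quantile(sizes, [1 / 3, 2 / 3])   # tertile cut points
labels = ["small", "medium", "large"]

for name, n in sorted(datasets, key=lambda d: d[1]):
    cls = labels[int(np.searchsorted(cuts, n, side="right"))]
    print(f"{name:12s} {n:4d} projects -> {cls}")
```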