Abedu, S.; Mensah, S.; Boafo, F. (2023). An Optimal Spacing Approach for Sampling Small-sized Datasets for Software Effort Estimation. Research Article.
Date issued: 2023-09-26
DOI: 10.18293/SEKE2023-176
URI: http://ugspace.ug.edu.gh:8080/handle/123456789/40111
Language: en
Keywords: Deep learning; Conventional machine learning; Software effort estimation; Small-sized; Optimal spacing theory

Abstract:
Context: There has been a growing research focus on conventional machine learning techniques for software effort estimation (SEE). However, few studies assess the performance of deep learning approaches in SEE, because SEE datasets are relatively small. Purpose: This study seeks to define a threshold for small-sized datasets in SEE and to investigate the performance of selected conventional machine learning and deep learning models on such datasets. Method: Plausible SEE datasets, with their numbers of project instances and features, are extracted from the existing literature and ranked. Eubank's optimal spacing theory is used to discretise the ranking of project instances into three classes (small, medium, and large). Five conventional machine learning models and two deep learning models are trained on each dataset classified as small-sized, using leave-one-out cross-validation. The mean absolute error is used to assess the prediction performance of each model. Result: Findings from the study contradict existing knowledge by demonstrating that deep learning models provide improved prediction performance compared to conventional machine learning models on small-sized datasets. Conclusion: Deep learning can be adopted for SEE with the application of regularisation techniques.
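The evaluation protocol described in the Method paragraph (leave-one-out cross-validation scored by mean absolute error, with a regularised model) can be sketched as follows. This is an illustrative sketch only: the toy dataset, the `loocv_mae` and `ridge_fit_predict` names, and the use of closed-form ridge regression as the regularised learner are assumptions for demonstration, not the paper's actual models or data.

```python
import numpy as np

def loocv_mae(X, y, fit_predict):
    """Leave-one-out CV: hold each project out once, predict it from the
    rest, and return the mean absolute error over all held-out projects."""
    n = len(y)
    errors = []
    for i in range(n):
        mask = np.arange(n) != i
        pred = fit_predict(X[mask], y[mask], X[i:i + 1])
        errors.append(abs(pred[0] - y[i]))
    return float(np.mean(errors))

def ridge_fit_predict(X_train, y_train, X_test, lam=1.0):
    """Closed-form ridge regression (L2 regularisation) with an intercept
    column, standing in for the paper's regularised learners."""
    Xt = np.hstack([X_train, np.ones((len(X_train), 1))])
    Xs = np.hstack([X_test, np.ones((len(X_test), 1))])
    w = np.linalg.solve(Xt.T @ Xt + lam * np.eye(Xt.shape[1]), Xt.T @ y_train)
    return Xs @ w

# Hypothetical small-sized SEE dataset: features = (size in KLOC, team size),
# target = effort in person-months. Values are invented for illustration.
X = np.array([[2.0, 3], [4.0, 5], [1.0, 2], [6.0, 8], [3.0, 4], [5.0, 6]])
y = np.array([10.0, 22.0, 6.0, 35.0, 16.0, 28.0])

print(loocv_mae(X, y, ridge_fit_predict))
```

Because every project serves as the test case exactly once, leave-one-out cross-validation makes full use of a small dataset, which is why it is the standard protocol for small-sized SEE benchmarks.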