The Impact Of Cluster Resolution Feature Selection On Pattern Recognition And Classification For Detecting Sudan Dye Adulteration In Palm Oil.
Publisher
Microchemical Journal
Abstract
This study evaluates the performance of several commonly used chemometric and machine learning techniques,
namely principal component analysis (PCA), artificial neural network (ANN), k-nearest neighbors (KNN), logistic
regression discriminant analysis (LRDA), partial least squares discriminant analysis (PLSDA), support vector
machine (SVM), and gradient boosted decision tree (GBDT), on HATR-FTIR data for detecting Sudan dye
adulteration in palm oil. We employed icoshift for data alignment and Savitzky-Golay smoothing to enhance
the data quality. Cluster resolution feature selection (CRFS) selected 2.39 % of 3351 features. Using only the 80
selected features, PCA models showed a clear separation between adulterated and pure palm oil samples and an
improvement in explained variance, neither of which had been observed before feature selection. LRDA, PLSDA and
SVM showed improved training true positive rate (TPR), accuracy (ACC) and Matthews correlation coefficient (MCC)
after feature selection. KNN showed improvement in all model quality parameters after feature selection.
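For readers unfamiliar with the model quality parameters named in the abstract, the following is a minimal sketch of how TPR, ACC and MCC are conventionally computed from binary confusion-matrix counts. The function name and the counts are hypothetical and for illustration only; they are not results from the study. The last line simply checks the reported feature fraction (80 of 3351).

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Return (TPR, ACC, MCC) from binary confusion-matrix counts."""
    # True positive rate: share of adulterated samples correctly flagged.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    # Accuracy: share of all samples classified correctly.
    acc = (tp + tn) / (tp + fp + tn + fn)
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return tpr, acc, mcc


# Hypothetical counts, NOT taken from the paper.
tpr, acc, mcc = classification_metrics(tp=18, fp=2, tn=28, fn=2)

# Fraction of features retained by CRFS, as stated in the abstract.
selected_fraction = 100 * 80 / 3351
```

MCC uses all four cells of the confusion matrix, which makes it less sensitive than accuracy to an imbalance between adulterated and pure samples.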
Description
Research Article
Citation
Kwao, J. K., Mingle, C., Addotey, J. N., Opuni, K. F., & Adutwum, L. A. (2025). The impact of cluster resolution feature selection on pattern recognition and classification for detecting Sudan dye adulteration in palm oil. Microchemical Journal, 208, 112433.
