International Journal of Service Science, Management, Engineering, and Technology
Volume 13 • Issue 1

The Use of Machine Learning Algorithms in the Classification of Sound: A Systematic Review

Akon O. Ekpezu, University of Ghana, Ghana (https://orcid.org/0000-0002-9502-1052)
Ferdinand Katsriku, University of Ghana, Ghana
Winfred Yaokumah, University of Ghana, Ghana* (https://orcid.org/0000-0001-7756-1832)
Isaac Wiafe, University of Ghana, Ghana (https://orcid.org/0000-0003-1149-3309)

ABSTRACT

This study is a systematic review of literature on the classification of sounds in three domains: bioacoustics, biomedical acoustics, and ecoacoustics. Specifically, 68 conference and journal articles published between 2010 and 2019 were reviewed. The findings indicated that support vector machines, convolutional neural networks, artificial neural networks, and statistical models were predominantly used in sound classification across the three domains. Also, the majority of studies that investigated medical acoustics focused on respiratory sound analysis. Thus, it is suggested that studies in biomedical acoustics should pay attention to the classification of sounds from other internal body organs to enhance the diagnosis of a variety of medical conditions. With regard to ecoacoustics, studies on extreme events such as tornadoes and earthquakes for early detection and warning systems were lacking. The review also revealed that marine and animal sound classification was dominant in bioacoustics studies.

KEYWORDS

Acoustic Signals, Artificial Intelligence, Classification, Deep Learning, Environmental Monitoring, Machine Learning, Medical Diagnosis, Security Surveillance, Sound

INTRODUCTION

Sound or acoustic signals are gradually gaining research popularity as a tool for environmental monitoring, security surveillance, diagnoses of diseases, critical information infrastructure protection, and data transmission (Bourouhou et al., 2019; Ibrahim et al., 2018; Loey et al., 2020; Luque et al., 2018).
Sound is considered the second most important sense after sight that is capable of carrying information about the environment (Perr, 2005). Although sound varies depending on seasons, time, geographic location, and propagation medium, it is considered one of the most significant signals used to monitor and detect changes in the environment. Accordingly, the ability to differentiate (classify) one sound or acoustic signal type from another is pertinent and, if accomplished, would result in significant progress in application areas such as early warning disaster management, medical diagnosis (Loey, Naman, & Zayed, 2020), and action or event detection. Recent studies have shown that machine learning (ML) algorithms are efficient in the domains of image and speech recognition, natural language processing, medical imaging, data extraction (Dwivedi et al., 2019; Malfante et al., 2018; Tatoian & Hamel, 2018), and text classification (Elfergany & Adl, 2020; Sangwan & Bhatnagar, 2020). Classification aims at accurately predicting the target object and differentiating one object class from another given a set of data. It is predominantly performed using selected features that feed classifier tools such as machine learning and neural networks (Mitilineos et al., 2018).

DOI: 10.4018/IJSSMET.298667
*Corresponding Author
This article published as an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) which permits unrestricted use, distribution, and production in any medium, provided the author of the original work and original publication source are properly credited.
In particular, sound classification is aimed at classifying audio segments into specific classes, which requires an understanding of the fundamental structure of frequencies in acoustic signals (Dwivedi et al., 2019). This is commonly addressed with features used in speech and music processing such as Mel frequency cepstral coefficients (MFCC), linear prediction coefficients (LPC), linear prediction cepstral coefficients (LPCC), and fast Fourier transforms (Briggs et al., 2012; Chu et al., 2009; Davis & Suresh, 2019; Karbasi et al., 2011; Mitilineos et al., 2018; Oletic et al., 2012; Pramono et al., 2017; Sengupta et al., 2016). A variety of machine learning techniques have also been adopted to obtain robust sound classification models. Considering the plethora of acoustic features and ML algorithms, coupled with the nature of sound, it is imperative to offer researchers an indication of the major research trends and methodologies that can assist in designing and developing automatic sound classification systems. Accordingly, this study provides summaries of the existing literature on algorithms for the classification of sound and analyzes the use of ML in the various sound classification tasks. The specific objective of the review is to identify: (a) publication patterns in acoustic signal classification, (b) trends in the use of ML in acoustic signal/sound classification, (c) open questions and challenges in the use of ML algorithms in acoustic signal classification, and (d) research gaps in the subject area.

KNOWLEDGE GAP AND REVIEW QUESTIONS

Previous studies that conducted reviews on sound or acoustic signal classifications employed artificial intelligence (AI) techniques. These studies were evaluated using Greenhalgh’s (1997) evaluation criteria.
The criteria evaluate systematic reviews by assessing the relevance of the review question, the search strategy, the methodological quality, and the sensitivity and presentation of results and findings. Although the evaluation of existing reviews in the subject area indicated that the available studies provided summaries and reproducible review methodologies, they focused predominantly on the classification of biomedical acoustic signals, particularly heart sounds (Dwivedi et al., 2019), lung sounds (Palaniappan et al., 2013), respiratory sounds (Pramono et al., 2017), and speech sound disorder in children (Wren et al., 2018). Thus, a broader systematic review on the classification of sounds is lacking. Considering the many applications of sound, the lack of sufficient summaries justifies the need for a systematic review of sound classification. Recently, ML algorithms have been used for various classification tasks (Hao, Weiss, & Brown, 2018). However, due to the plethora of ML algorithms (Hlioui, Aloui, & Gargouri, 2020; Salama & Hassanien, 2014), choosing a suitable algorithm for a specific classification task becomes difficult. Hence, there is a need to identify open questions, publication trends, and current approaches in algorithm usage that will assist researchers in positioning new research activities in sound classification and detection appropriately. To address this, this review examines two broad issues. The research questions stated in Table 1 are divided into two categories. Category A consists of questions that seek to provide an overview of publication trends, whereas Category B seeks to provide a good methodological background for a broader work by identifying research gaps and current methodologies in the domain.
Specifically, it is expected that the study will provide pertinent information regarding patterns in publications since 2010, academic outlets that are dominant and attracting more studies, and countries that have focused most on acoustic signal classification within the specified period. As mentioned earlier, questions in Category B will provide information on techniques and domains of application. Accordingly, the review questions will seek to summarize information on application domains that have dominated artificial intelligence (AI) for acoustic signal studies, the most used datasets, the machine learning techniques adopted, and the evaluation methods that are mostly adopted. Table 1 provides a summary of the proposed review questions and the corresponding rationale for posing them.

Table 1. Research Questions and Objectives

| Category | Research Questions | Objectives |
| --- | --- | --- |
| A | What are the yearly publication trends? | To identify the frequency of primary studies per year. |
| A | What journal has the highest number of publications? | To identify the frequency of publications per journal. |
| A | What is the frequency of authors? | To identify authors who are consistent in writing on the subject area. |
| A | What is the country’s origin of authors’ affiliated institutions? | To identify countries with the highest number of publications. |
| B | What kind of sound is classified? | To identify the dominant and less dominant types of classified sounds. |
| B | What is the format of the sound? | To identify predominantly used audio formats for classification. |
| B | What are the sample rates of the audio recordings? | To determine the maximum audio frequency that can be reproduced. |
| B | What datasets were used for the classification and (or) evaluation? | To identify datasets that are available for public use. |
| B | What are the various application domains? | To identify domains in which sound classification is predominantly performed. |
| B | What ML techniques have been used for sound classification? What measures are used to evaluate model performance? | To identify predominantly used ML techniques and performance metrics in sound classification. |

REVIEW APPROACH

A systematic search of the literature was carried out in two databases: Scopus and Acoustical Society of America (ASA). Scopus was selected because it is arguably the most extensive abstract and citation database for academic publications, whereas the ASA publications database was selected purposefully since it is the leading source of theoretical and experimental studies in acoustics. Publications were extracted from the selected databases using key search terms and their possible combinations using the logical ‘and’ operator. The key search terms included classification, sound, acoustic signals, machine learning, deep learning, and artificial intelligence. The combination of the search terms produced the following search phrases (SP):

SP1 Classification of sound and machine learning
SP2 Classification of sound and deep learning
SP3 Classification of sound and artificial intelligence
SP4 Classification of acoustic signals and machine learning
SP5 Classification of acoustic signals and deep learning
SP6 Classification of acoustic signals and artificial intelligence

Inclusion and Exclusion Criteria

A set of specific eligibility criteria was defined and followed to limit the collection of articles to only those that fit the research objectives. A suitability check of returned articles was performed after examining the title and removing duplicate papers. Only articles in which the aim, classification techniques, and/or results were explicitly stated in the abstract were considered.
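The six search phrases listed under the review approach are the combinations of two subject terms with three technique terms. A minimal sketch of that combination is given here for illustration only: the variable names are hypothetical, and the exact query syntax accepted by Scopus or the ASA database is not reproduced.

```python
from itertools import product

# Search terms as named in the review approach.
subjects = ["sound", "acoustic signals"]
techniques = ["machine learning", "deep learning", "artificial intelligence"]

# Pair every subject with every technique, numbering the results SP1..SP6.
search_phrases = {
    f"SP{i}": f"Classification of {subject} and {technique}"
    for i, (subject, technique) in enumerate(product(subjects, techniques), start=1)
}

for label, phrase in search_phrases.items():
    print(label, phrase)
```

Because `product` varies the last argument fastest, the numbering matches the order of SP1 to SP6 above.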
The inclusion and exclusion criteria are as follows:

C1 Include only open-access journal articles and peer-reviewed conference papers written in English and published between the years 2010 and 2019.
C2 Include articles whose titles contain keywords like classification and acoustic signals or sound and machine learning or deep learning, or whose title suggests sound classification using artificial intelligence.
C3 Exclude duplicate papers from the search results.
C4 Exclude papers whose abstracts do not explicitly state the classification techniques and/or results of the evaluation metrics used.
C5 Exclude by document type, i.e., exclude secondary studies, books, theses, reports, and letters.

Study Selection and Data Extraction

The six search phrases mentioned earlier were used to search the Scopus and ASA databases. The protocol for this systematic review has three main steps. In the first step, the retrieved articles were screened against the initial criteria (C1 and C2). In the second step, eligible articles were exported to a spreadsheet (MS Excel) for further exclusion by duplicate, abstract, and type of study (C3, C4, and C5). The ASA database does not have an export feature; hence, this phase of exclusion was done directly from the browser and manually documented. The third step entailed downloading and reading eligible articles to extract relevant data concerning the review questions. The extracted data was collated in a spreadsheet for ease of use and analysis. Figure 1 is a flow diagram showing the results of the screening after each stage of exclusion or inclusion.

Figure 1. Flow diagram of study screening/selection

As shown in Figure 1, the initial search output contained 1,295 journal and conference articles published from 2010 to 2019.
Out of these, 181 studies were included after an initial screening by title and keywords, and a total of 90 articles were obtained after the removal of duplicates. Furthermore, 22 studies were excluded based on abstract and document type. Finally, 48 journal papers and 20 conference articles were selected and used in the study.

REVIEW RESULTS AND FINDINGS

Publication Trends (Category A)

This section presents the findings on the publication frequency, distribution of journals, authors, and their country of origin. As mentioned earlier, a total of sixty-eight (68) conference proceedings and journal articles were identified at the end of the selection process. This comprised 20 (29%) conference publications and 48 (71%) journal articles. An analysis of the publication frequency (Figure 2) showed that between 2 and 4 publications were recorded per year from 2011 to 2016. However, from 2017 there was a change in trend such that both conference and journal articles were recorded each year. Also, there was an upsurge in publications from 2016, with the number doubling from 9 publications in 2017 to 18 publications in 2018. Considering the upsurge in publications, the popularity of artificial intelligence, and the emergence of sound as an alternative means of environmental monitoring, it is envisaged that the area of sound classification will draw more research attention.

Figure 2. Publications by year

Further, an examination of the sources of the included studies showed that the 68 studies were distributed among 35 Scopus-indexed conference proceedings and journals, and 69% of the studies were published in the Journal of the Acoustical Society of America (JASA).
The journals and their corresponding number of included studies are JASA (24), Applied Sciences (3), Sensors (3), IEEE Access (2), APSIPA Transactions on Signal and Information Processing (1), Biomedical Journal, ELSEVIER - Computers & Electronics in Agriculture, Electronics (1), Elektronika ir Elektrotechnika (1), EURASIP Journal on Image & Video Processing (1), Expert Systems with Applications (1), Frontiers in Neuroscience (1), IEEE Signal Processing Letters (1), IEICE Transactions on Information and Systems (1), International Journal of Fuzzy Logic & Intelligent Systems (1), International Journal of Online & Biomedical Engineering (1), International Journal of Online Engineering (1), Noise Mapping (1), PeerJ (1), and PLoS ONE (1). The conference proceedings include ACM International Conference Proceeding Series (2), ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing (2), Computing in Cardiology (2), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Bioinformatics) (2), Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (2), 8th International Conference on Health Informatics (1), International Conference on Machine Learning and Applications, ICMLA 2012 (1), International Conference on Pattern Recognition (1), Proceedings of the International Conference on Neural Networks (1), MATEC Web of Conferences (1), IEEE International Workshop on Machine Learning for Signal Processing, MLSP (1), Procedia Computer Science (1), 2019 IEEE International Symposium on Signal Processing and Information Technology, ISSPIT (1), Journal of Physics: Conference Series (1), and 2010 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC (1).

Authors and country of origin.
An analysis of the authors and their country of origin (the country in which their affiliated institution is located) was performed to identify the author or group of authors who are consistent in writing on the subject area, as well as countries with the leading number of publications. With the number of authors per article ranging from 2 to 9, a headcount of the various authors showed that 229 authors wrote the 68 selected papers. Furthermore, 6 groups of leading authors in the subject area (i.e., authors with more than one publication) were identified (see Table 2). It was observed that, out of the 6 groups, 4 groups of authors were all interested in classifying sounds from animals (bioacoustics) and their publications were all journal articles. The 5th group was interested in classifying environmental sound and had both conference and journal articles, while the 6th group focused on heart sounds with conference articles only. Furthermore, the authors’ country of origin (i.e., address of the authors) and the frequency of publications per year were identified. According to Figure 3, the authors were from 31 different countries, with the UK and USA leading the trend with 16% and 12% respectively. China, France, India, and Korea made up 8%, 7%, and 6% respectively. Portugal and Spain accounted for 5% and Germany for 4%, while the other 22 countries made up 37% of the publication trend. Further, the highest number of publications in 2019 was from India (5), the highest in 2018 was from China (4) and Korea (4), and the highest in 2017 was from the UK (5). Again, China (2) and the USA (3) had the highest number of publications in 2016 and 2012 respectively. Other years had one study per country. The countries with only one study include Ireland, Pakistan, Saudi Arabia, Morocco, Italy, Hong Kong, Jordan, Taiwan, Brazil, Estonia, Switzerland, Singapore, Sweden, and Austria.
It is worth noting that publications from the UK have been the most consistent: since 2012, at least one publication has come from the country each year.

Table 2. Leading authors

| Group | Paper Title | Year | Journal | Study |
| --- | --- | --- | --- | --- |
| 1 | An approach for automatic classification of grouper vocalizations with passive acoustic monitoring. | 2018 | The Journal of the Acoustical Society of America | A9 |
| 1 | Classification of red hind grouper call types using a random ensemble of stacked autoencoders. | 2019 | The Journal of the Acoustical Society of America | A12 |
| 2 | Dynamic time warping and sparse representation classification for birdsong phrase classification using limited training data. | 2015 | The Journal of the Acoustical Society of America | A18 |
| 2 | A robust automatic birdsong phrase classification: A template-based approach. | 2016 | The Journal of the Acoustical Society of America | A14 |
| 3 | Domestic cat sound classification using learned features from deep neural nets. | 2018 | Applied Sciences | A30 |
| 3 | Domestic cat sound classification using deep learning. | 2018 | International Journal of Fuzzy Logic and Intelligent Systems | A44 |
| 4 | Non-sequential automatic classification of anuran sounds for the estimation of climate change indicators. | 2018 | Expert Systems with Applications | A31 |
| 4 | Temporally aware algorithms for the classification of anuran sounds | 2018 | PeerJ | A33 |
| 5 | Unsupervised Feature Learning for Urban Sound Classification | 2015 | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing | A55 |
| 5 | Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound | 2017 | IEEE Signal Processing Letters | A46 |
| 6 | Heart murmur classification with feature selection | 2010 | 2010 Annual International Conference of the IEEE Engineering in Medicine and Biology Society | A67 |
| 6 | Heart murmur classification using complexity signatures | 2010 | Proceedings - International Conference on Pattern Recognition | A58 |

Sound Categories, Datasets, Classification Techniques, and Performance Metrics (Category B)

This section discusses the results obtained in line with Category B of the research questions. For ease of reference, the selected articles have been numbered in the order in which they were selected (A1 to A68), and these codes are used in further analysis (see Appendix for a list of included studies).

Figure 3. Distribution of publications by authors’ country of origin

Classification of sound and data sources. Sounds produced by plants, animals, and humans are numerous and vary on land, in air, and in water depending on the medium of propagation, seasons, activities, and geographic location. There are three main sources of sound: Anthrophony (sounds made or caused by humans) such as shipping and drilling noise; Geophony (sound from the environment) such as sea surface noise like the breaking of waves, icebreaking, and raindrops; and Biophony (sounds from animals) such as vocalizations of mammals, anurans, and groupers. This section highlights the different kinds of sounds that were classified, data sources, sample rates, and availability of datasets as found in the selected articles.
As shown in Table 3, 31 studies focused on classifying sounds caused by animals (biophony), 19 classified sounds caused or made by human beings (anthrophony), and the other 18 classified sounds from a combination of the three sound categories (anthrophony, geophony, and biophony). In the biophony category, researchers were predominantly interested in classifying sounds from different species of Odontocetes and Mysticetes (marine mammals). While some of the researchers were interested in automatically detecting, classifying, and localizing call types from different species (Guilment et al., 2018; Halkias et al., 2013; Roch et al., 2011; Shamir et al., 2014), others were only interested in classifying vocalizations of humpback whales, whistles and pulses of dolphins, song cycles of whales, and echolocation clicks of beaked whales (Allen et al., 2017; LeBien & Ioup, 2018; Ou et al., 2013; Parada & Cardenal-Lopez, 2014). Classified sounds caused by humans (anthrophony) included respiratory sounds, human voice disorder, blast sound, snore sound, and baby cry. The baby cry was classified to identify the health state of a baby (i.e., need, pain, discomfort, or a medical condition) (Aucouturier et al., 2011), while snoring as a medical condition was classified as a means to automatically differentiate types of snore sounds (Amiriparian et al., 2017). Similarly, to automatically detect medical conditions such as cardiovascular diseases and respiratory tract diseases, EEG signals and heart sounds were classified to identify wheezes, crackles, murmurs, extra-systole, and normal and abnormal heartbeats. It would appear that, apart from organs involved in respiratory and cardiovascular activity, no other sound from internal organs of the human body was of interest. It might therefore be instructive to consider extending work to cover sounds from other internal human organs, such as the intestines.
Such work might be useful in understanding ailments that affect the digestive system. Also, blast sound was classified to differentiate between blast noise and non-blast noise (Cvengros et al., 2017). Sounds from the environment were predominantly classified to differentiate indoor, outdoor, natural, vocal, and non-vocal human sounds.

Table 3. Summary of classified sounds and datasets

| No. | Sound source/type | Link to dataset/name of the dataset | Article code |
| --- | --- | --- | --- |
| | Biophony (sounds from animals) | | |
| 1 | Marine mammals – Whales and Dolphins | DEFLOHYDRO, OHAS-ISBIO, DCLDE 2015, Auau Channel 2002, French Frigate Shoals (FFS), CEMMA datasets (http://www.cemma.org), https://data.gulfresearchinitiative.org | A1, A4, A13, A15, A16, A17, A19, A20, A22 |
| 2 | Birds | http://www.animalsoundarchive.org/Refsys/Statistics.Php, Birdcalls71, Flight calls, Anuran, CAVI, and CUB-200-2011 standard dataset | A2, A5, A14, A18, A21, A41, A7 |
| 3 | Fish and Groupers | http://www.fishbase.org/ and http://www.dosits.org/, SEACOUSTIC2014 | A3, A9, A12, A37 |
| 4 | Primates – Marmosets and Monkeys | http://home.ustc.edu.cn/~zyj008/background_noise.wav, http://marmosetbehavior.mit.edu | A8, A11, A35 |
| 5 | Amphibians – Frogs and Anuran | Recordings from commercial compact discs (CD), recordings from natural habitat, http://www.fonozoo.com/ | A24, A31, A33, A52 |
| 6 | Domestic/farm animals – dog, cat, sheep, cattle, Maremma sheepdogs | Online video sources including YouTube, Kaggle challenge database and Flickr, and https://github.com/kyb2629/pdse | A25, A32, A30, A44 |
| | Anthrophony (sounds made/caused by humans) | | |
| 7 | Military blast sound | LRPE, East South Central, APG, SERDP-PITT, MCBC-PITT, New York (Fort Drum) | A6 |
| 8 | Baby cry, human voice disorders | N/M | A24, A48 |
| 9 | Respiratory/heart/lung sound, EEG (electroencephalogram) signals | https://github.com/yaseen21khan/Classification-of-heart-sound-signal-using-multiple-features-/blob/master/README.md, https://physionet.org/challenge/2016/, https://www.cs.colostate.edu/eeg, (the Physionet database), Int. Conf. on Biomedical Health Informatics (ICBHI) scientific challenge database, Dataset B - PASCAL classifying heart sounds challenge, live recordings from patients using Bluetooth stethoscope | A27, A28, A29, A34, A38, A45, A47, A51, A53, A54, A56, A58, A61, A66 |
| 10 | Snore sound | Munich-Passau snore sound corpus | A63 |
| | Geophony (sound from the environment) and combination of various sound sources | | |
| 11 | Cinematic sound | 44-film dataset | A57 |
| 12 | Oil, water, and gas | Live recordings | A68 |
| 13 | Environmental sound | Real-world computing partnership (RWCP) sound scene dataset, DCASE challenge dataset, FindSounds database, Urban-sound 8k dataset, TIDIGITS dataset, ESC-10, ESC-50 dataset, freesound.org, TUT database for acoustic scene classification & sound event detection, YouTube videos | A26, A35, A36, A39, A40, A42, A43, A46, A49, A50, A55, A59, A60, A62, A64, A65 |

Environmental sounds classified are both indoor and outdoor sounds including air-conditioner, car horns, children playing, dog bark, drilling, engine idling, gunshot, jackhammers, siren, street music, running water, applause, footsteps, crowd, musical instruments, thunder, sea waves, etc.
Sample rate, audio format, and signal representation. The sample rate, which is the number of audio samples carried per second, ranged from 0.1 kHz to 192 kHz. The predominantly used sample rates lay between 22 and 44.1 kHz. Across the 13 classified sound categories identified from the included primary studies, the dominant audio format was .wav. Others included mp3 (Parada & Cardenal-Lopez, 2014; Shamir et al., 2014), ARFF (Zhang et al., 2016), and HDF5 (Bold et al., 2019). Furthermore, signals and audio files were predominantly represented visually as spectrograms. Spectrograms are graphical or visual representations of sound with frequency on the vertical axis, time on the horizontal axis, and a dimension of color that represents the intensity of the sound at each time-frequency location. According to Amiriparian et al. (2017), Halkias et al. (2013), Malfante et al. (2018), Oikarinen et al. (2019), and Ou et al. (2013), treating spectrograms as natural images allows them to be processed with available image processing tools. Additionally, it helps in removing the effect of background disturbances on the classification process (Thakur et al., 2019). Features extracted from spectrograms usually outperform hand-crafted features since spectrograms do not discriminate phrase classes with similar dominant frequency trajectories (Tan et al., 2015). However, unlike natural images, in which the axes carry the same meaning irrespective of their location (i.e., weights can be shared across the vertical and horizontal dimensions), the axes of a spectrogram do not carry the same meaning (frequency and time are the vertical and horizontal dimensions, respectively).

Sources of data. To identify publicly available datasets, the datasets used in the reviewed articles were divided into two categories: pre-existing sound datasets and live recordings.

i. Pre-existing sound datasets: This category was made up of sound collected from past experiments, past projects, or existing sound databases. Twenty-eight datasets were identified in this category. Out of the 28, only 18 were stated to be publicly available, while the availability of the others was either not mentioned or stated as not available due to licensing or privacy issues.
ii. Live recordings: This category of datasets was generated by the researchers specifically for their research. It is made up of recordings of the subject of interest either in their natural habitat (Allen et al., 2017; Briggs et al., 2012; Ibrahim et al., 2019; LeBien & Ioup, 2018; Roch et al., 2011; Shamir et al., 2014) or in a controlled environment such as recording rooms and laboratories (Giret et al., 2011; Oikarinen et al., 2019; Zhang et al., 2018). In some cases, a recording device was attached to the animals, while for humans a Bluetooth stethoscope was used to obtain recordings of heart sounds. Other live recordings were collected with recording units such as hydrophones, passive acoustic monitoring (PAM) systems, and shotgun microphones attached to divers, the seafloor, moving boats, or sinks. In all, 24 datasets were privately generated and only 5 are available to the public.

An important consideration in research into sound classification is the availability of datasets. Easy access to high-quality datasets is critical to research success in the field. With a total of 52 mentioned data sources from both categories, only 24 are reported to be publicly available. This confirms the challenge of limited datasets reported by researchers in sound classification. Whilst researchers may be able to readily generate or record some forms of sound for use in research, including outdoor sounds like the barking of dogs, other forms of sound may not be so easily generated or recorded, for example volcanic activity or sound from an impending tsunami.
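To make the spectrogram representation described under sample rate, audio format, and signal representation more concrete, the following minimal sketch computes one from a synthetic tone using only the Python standard library. It is an illustration under stated assumptions, not the pipeline of any reviewed study: the function name `spectrogram`, the frame length, hop size, Hann window, and the 1 kHz test tone sampled at 8 kHz are all hypothetical choices. It also shows why the sample rate bounds the highest frequency that can be represented, since the frequency axis ends at half the sample rate.

```python
import cmath
import math


def spectrogram(samples, sample_rate, frame_len=64, hop=32):
    """Return (freqs_hz, times_s, power) for a mono signal.

    power[t][k] is the squared magnitude of frequency bin k in frame t.
    """
    # A Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # bins from 0 Hz up to half the sample rate
    freqs = [k * sample_rate / frame_len for k in range(n_bins)]
    times, power = [], []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        # Naive discrete Fourier transform of one frame (quadratic cost).
        row = []
        for k in range(n_bins):
            coeff = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n in range(frame_len))
            row.append(abs(coeff) ** 2)
        power.append(row)
        times.append((start + frame_len / 2) / sample_rate)
    return freqs, times, power


# Illustrative input (assumed values): a 1 kHz tone sampled at 8 kHz for 0.1 s.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(800)]
freqs, times, power = spectrogram(tone, sample_rate)

# The strongest bin of the first frame sits at the tone frequency,
# and the last bin sits at half the sample rate.
peak_bin = max(range(len(freqs)), key=lambda k: power[0][k])
print(freqs[peak_bin], freqs[-1])
```

In practice a numerical library with a fast Fourier transform would replace the naive transform used here; the structure of the result (time frames by frequency bins) stays the same.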
Distribution of classified sounds according to the application domain. Considering the different types of classified sounds, the specific sound environment, and the researcher’s objective for classifying the chosen sound, the classified sounds were categorized into three broad domains: Bioacoustics, Biomedical acoustics, and Ecoacoustics (see Figure 4). The application domain of bioacoustics was the most explored, making up 50% of the study population. This domain consists of studies that classified sounds made by animals and human beings with the predominant aim of differentiating sounds and call types between and within animal species. Variations in animal sounds were also classified based on geographical locations.

On the other hand, the biomedical domain made up 24% of the study population and consists of studies that classified snore, heart, and lung related diseases using sound. The goal of this domain was to provide an automated and efficient sound/acoustic signal classification system that will assist medical practitioners in smart diagnosis. Studies in this category also sought to eliminate the invasive traditional vision methodologies such as the use of medical imaging (Chen et al., 2019; Oweis et al., 2015; Vrbancic & Podgorelec, 2018). The remaining 26% of the studies explored sounds from the environment (ecoacoustics) to automatically recognize environmental acoustic scenes as well as to precisely classify the detected sound. This classification will enable the identification of sound events, environmental monitoring, and surveillance. The ecoacoustics domain consisted of sounds from sub-domains such as human activities, urban environment, surveillance, machinery, weather, and musical instruments.

Figure 4. The distribution of application domains

Sound Classification Algorithms and Performance Metrics

An automatic classifier not only identifies or differentiates one sound from another, but also reduces false detection of sounds (Binder & Paul, 2019). Thus, this section provides a summary of the distribution of ML techniques and performance metrics used in the included studies for sound classification over the study years (i.e., between 2010 and 2019). Several ML techniques were identified from the studies and categorized as follows:

i. Support Vector Machine (SVM) - SVM, Linear SVM, Radial Basis Function (RBF) SVM, MIML (multi-instance multi-label) SVM
ii. Convolutional Neural Network (CNN) - CNN, Feedforward deep convolutional neural network, two-stream CNN (TSCNN-DS), CaffeNet pre-trained CNN, LeNet based CNN, SoundNet, EnvNet, multi-scale CNN (WaveMsNet), AlexNet, GoogleNet, and VGG16
iii. Artificial Neural Network (ANN) - Deep Neural Network (DNN), Multilayer perceptron (MLP), Self-organizing map, Deep residual networks (ResNets), Convolutional deep belief network (CDBN), Sparse Auto-Encoder (SAE), Self-organizing map-Spike Neural Network (SOM-SNN)
iv. Long Short-Term Memory - Recurrent Neural Network (RNN), LSTM-RNN, and Long short-term memory-fully convolutional network (LSTM-FCN)
v. Random forest (RF)
vi. K-Nearest neighbor (kNN)
vii. Logistic Regression (LR)
viii. Decision Tree (DT)
ix. K-Means
x. Ensemble Learners (EL)
xi. Others - Sparse Representation-based Classifiers (SRC), Dynamic Time Warping (DTW), Hidden Markov Model (HMM), Gaussian Mixture Models (GMM), aural classifiers, Non-Temporally Aware (NTA), Kernel-based extreme learning machine (KELM), Multi-view simple disagreement sampling (MV-SDS)

Overall, SVM, CNN, and ANN were the three predominantly used ML techniques in sound classification.
Together, these three techniques were adopted by 62% of the included studies (see Figure 5). Figure 5 presents a summary of the amount of research interest that each ML technique has received during the past decade. Further, it highlights the distribution of research interest in ML techniques in each publication year. It is important to note that more than one ML technique was used in some studies. Compared to other identified ML techniques, SVM, CNN, and ANN have received dominant research interest over the years, with at least one of these techniques used in every year between 2010 and 2019 except 2014, when Gaussian mixture models (GMM) were used.

Support vector machine (SVM) has been identified as a robust technique in both classification and regression tasks. It is a supervised machine learning algorithm that seeks to find the hyperplane which optimally separates the labeled data into their various classes (Bourouhou et al., 2019; Cvengros et al., 2012; Noda et al., 2016; Qian et al., 2017; Yaseen et al., 2018). Most of the articles that used SVM focused on improving classification performance either by modifying existing approaches to SVM-based classification or by adding new features. Modifications to existing approaches included recursive feature elimination (SVM-RFE) and linear SVM (Cvengros et al., 2012), and SVM with linear kernels (Han et al., 2016), while added features included the cost parameter CSVM (Malfante et al., 2018). Generally, SVMs have been reported to be cumbersome for multi-class tasks but robust for binary sound classification.

Figure 5. Distribution of ML techniques over publication year

Neural networks are algorithms that imitate the operations of a human brain to identify patterns and trends in data.
Although their effectiveness is limited by the unavailability of labeled data, it is argued that they have self-organizing and adaptive learning properties with an outstanding ability to detect trends based on the sample data (Dwivedi et al., 2019). Accordingly, different types of neural networks in deep learning, including CNN, ANN, and LSTM, were adopted by researchers in the included studies, and they made up 44% of the identified ML techniques. Further, the identified classification techniques and their distribution of use were categorized as follows: supervised ML techniques (76%), unsupervised ML techniques (5%), semi-supervised techniques (1%), ensemble learning (3%), and others (sequential classifiers and statistical modeling techniques), which made up 15%.

Furthermore, advanced learning techniques such as transfer learning and ensemble learners were adopted by some researchers to obtain a more robust sound classification model as well as to overcome the challenges of limited data, overfitting, and lack of labeled data. Pre-trained CNN models such as VGG16, VGG19, LeNet-based CNN, SoundNet, EnvNet, multi-scale CNN (WaveMsNet), AlexNet, GoogleNet, and CaffeNet were adopted for transfer learning (Amiriparian et al., 2017; Boddapati et al., 2017; Bold et al., 2019; Pandeya & Lee, 2018; Zhao et al., 2018; Zhu et al., 2018). For ensemble learning, an ensemble of stacked autoencoders (Ibrahim et al., 2019) and an ensemble of supervised, unsupervised, and semi-supervised learning techniques such as random forest, kNN, Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA), and SVM-RBF (Humayun et al., 2018; Pandeya & Lee, 2018), combined using majority voting and unweighted averaging, were adopted. Additionally, a semi-supervised learning technique called active learning was used to minimize the demand for human annotation when training sound classification models (Han et al., 2016).

Figure 6. Distribution of classification techniques over application domain

Furthermore, the ML techniques were analyzed with respect to the three application domains identified in this review. More specifically, the distribution of ML techniques, as earlier categorized, was mapped to the domains of bioacoustics, biomedical acoustics, and ecoacoustics (see Figure 6). As shown in Figure 6, SVM and CNN were mostly used to classify medical and environmental sounds respectively, while ANN and other sequential classifiers and statistical models were mostly used to classify sounds from animals (bioacoustics). Generally, SVM, CNN, ANN, and other statistical models were predominantly used in the three domains. Figure 6 also shows that while all the identified classification techniques were used in bioacoustics, certain ML techniques were not used in the domains of biomedical acoustics and ecoacoustics. Specifically, random forest, K-means, and decision trees were not used in classifying medical sounds. Similarly, K-means and decision trees were not used in the classification of environmental sounds.

Performance Metrics. This section examines the performance measures adopted by researchers to validate the reliability of their proposed ML techniques for sound classification. These include evaluation measures such as cross-validation methods and classification metrics. Seven cross-validation methods were identified in the included studies for the primary purpose of evaluating model performance and computing classification accuracies.
They include 10-fold cross-validation (Aucouturier et al., 2011; Han et al., 2016; Lebien & Ioup, 2018; Medhat et al., 2020; Pandeya et al., 2018; Salamon & Bello, 2015, 2017; Su et al., 2019), Leave-One-Out Cross-Validation (LOOCV) (Bourouhou et al., 2019; Colonna et al., 2016; Oweis et al., 2015; Parada & Cardenal-Lopez, 2014; Vahabi & Selviah, 2019), 2-fold cross-validation (Ibrahim et al., 2018; Noda et al., 2016), 1 for 4-fold (Zhang et al., 2019), 5-fold cross-validation (Boddapati et al., 2017; Briggs et al., 2012; Fang et al., 2019; Mun et al., 2017; Tschannen et al., 2016; Yaseen et al., 2018), 20-fold cross-validation (Kumar et al., 2010), and 10-fold stratified cross-validation (Gingras & Fitch, 2013; Nogueira et al., 2019). Other specific reasons for adopting cross-validation techniques were to determine the validation error rate and to estimate algorithm performance (Han et al., 2016; Lebien & Ioup, 2018; Vahabi & Selviah, 2019).

Furthermore, classification metrics used to compare and evaluate the performance of the various ML and statistical techniques were identified. It is important to note that more than one metric was used to evaluate the performance of a classification technique in most of the studies. Figure 7 provides an overview of the number and the proportion of reviewed studies using each performance metric. As shown in Figure 7, accuracy was the predominantly used performance metric, adopted by 36% of the included studies. It was followed by confusion matrix (16%), recall/sensitivity (14%), and specificity (10%). Precision, F1-score, AUC score, ROC curve, and UAR were other adopted metrics in the included studies, at 4% each. True Positive Rate (TPR), False Positive Rate (FPR), G-mean, and mean error rate were the least used. Generally, it was observed that the classification techniques used in the included studies predominantly had good classification accuracies.
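To make the cross-validation schemes listed above concrete, the following is a minimal Python sketch of how k-fold, stratified k-fold, and leave-one-out splits partition a labelled dataset. It is an illustration only and is not taken from any of the reviewed studies; the function names (`k_fold_indices`, `stratified_k_fold_indices`) and the toy labels are hypothetical, and no shuffling is performed.

```python
from collections import defaultdict


def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k folds of near-equal size.

    Returns a list of (train_indices, test_indices) pairs. With
    k == n_samples this is leave-one-out cross-validation (LOOCV).
    Samples are assigned to folds in an interleaved order, unshuffled.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = sorted(
            j for f_idx, fold in enumerate(folds) if f_idx != i for j in fold
        )
        splits.append((train, test))
    return splits


def stratified_k_fold_indices(labels, k):
    """k-fold split that keeps class proportions roughly equal per fold."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    # Deal the samples of each class round-robin across the folds.
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    splits = []
    for i, test in enumerate(folds):
        train = sorted(
            j for f_idx, fold in enumerate(folds) if f_idx != i for j in fold
        )
        splits.append((train, sorted(test)))
    return splits


# Toy example: 8 "whale" clips and 2 "ship" clips (hypothetical labels).
labels = ["whale"] * 8 + ["ship"] * 2
loocv = k_fold_indices(len(labels), len(labels))  # 10 splits, 1 test clip each
two_fold = stratified_k_fold_indices(labels, 2)   # each test fold holds 1 "ship"
```

In this sketch, a plain 2-fold split could in principle place both "ship" clips in the same fold, whereas the stratified version guarantees one per fold, which is one reason stratification is attractive for small or skewed sound datasets.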
Figure 7. Distribution of the studies over performance metrics

DISCUSSIONS

The primary objective of the systematic review was to identify publication trends, methodological approaches, and current algorithms used in the automatic classification of sounds using ML techniques. This review was restricted to open-access conference and journal articles published between 2010 and 2019. Based on a set of inclusion and exclusion criteria, the 68 included studies were selected from the Scopus and ASA databases, with conference and journal articles making up 29% and 71% of the study population respectively. This systematic review was guided by two categories of review questions, which were answered accordingly.

In the first category of this review, the publication trends between the years 2010 and 2019 were highlighted. It was observed that 60% of leading authors (that is, authors with more than one publication) in sound classification predominantly focused on classifying sounds from animals, while the other 40% had an equal interest in classifying sounds in the environment and the biomedical domain. In addition, most of the studies originated from European and Asian countries as well as the USA (including the UK, USA, China, France, India, Korea, Portugal, Spain, and Germany), with a minimum of 3 publications and a maximum of 14 publications within the selected study years.

In the second category of review results, 13 groups of classified sounds that cut across the three major sound sources (anthrophony, biophony, and geophony) were identified. Further, these sound groups were divided into three application domains, namely Bioacoustics, Biomedical acoustics, and Ecoacoustics. It was observed that the bioacoustics domain attracted the most research interest and that researchers were mostly interested in classifying sounds from marine mammals.
Yet, little attention was given to classifying sounds from the underwater environment, even in studies that classified environmental sound. This is a research gap considering that 70% of the earth is covered with water and the temperature of the ocean determines climate and wind patterns, which in turn affect life on land and the ecosystem (Domingo, 2012). On the other hand, studies in the biomedical domain primarily focused on diagnosing respiratory diseases using sound. Although the classified sounds cut across three major application domains, the list of classified sounds is not exhaustive. For instance, studies in the biomedical domain should be extended to classify sounds from other internal body organs (as an alternative to radiography) to diagnose a variety of medical conditions. Studies should also investigate the classification of extreme events such as tornadoes, hurricanes, drought, and earthquakes using sound. This will enable early detection and warning systems for natural disasters.

Review results also showed that a major research challenge reported by researchers was the unavailability of standardized, labeled public datasets. This was particularly challenging for the biomedical domain; thus, researchers collected data (life recordings) from patients using a Bluetooth stethoscope. Yet the problem persists because the collected data cannot be made publicly available for future research. This could be a limiting factor that dissuades researchers from delving into certain areas of sound classification.

In the identification of feature extraction techniques, it was observed that, although a variety of feature extraction techniques were used, specific patterns linking the use of these techniques to a particular application domain could not be established.
However, it was observed that MFCCs were predominantly used in feature extraction due to their ability to imitate the hearing properties of the human ear.

Furthermore, reported approaches for sound classification involved the use of both machine learning and non-machine learning techniques. Amongst the various identified classification techniques, support vector machines (SVM), convolutional neural networks (CNN), artificial neural networks (ANN), and other probabilistic statistical models were predominantly used in the domains of bioacoustics, biomedical acoustics, and ecoacoustics. The findings on the prevalence of ANN, CNN, and SVM in the classification of medical acoustics are similar to findings from systematic reviews on ML in heart and lung sound analysis (Dwivedi et al., 2019; Palaniappan et al., 2013). Indeed, the predominant use of CNN for sound classification is no surprise considering that most of the studies adopted an image-based approach to sound classification using spectrograms. Mitilineos et al. (2018) posit that neural networks are adopted for sound classification due to their ability to identify specific patterns exhibited by sound sources using the distribution of energy over frequency and time. Also, machine learning techniques are well able to differentiate target acoustic signals/sounds from an acoustic background (Shamir et al., 2014). Although neural networks reportedly require high computational power and large datasets, no study reported this as a limitation or a challenge.

Overall, satisfactory results were reported for the various classification techniques, as observed in the results of the performance metrics. Performance measures such as cross-validation, classification accuracy, confusion matrix, recall, and precision were used to evaluate the performance of the classifiers. In cases of an unbalanced distribution of datasets, other performance metrics such as UAR, AUC, and ROC curve were adopted.
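The point about unbalanced datasets can be illustrated with a small worked example. The sketch below computes accuracy, per-class recall (sensitivity), specificity, precision, and unweighted average recall (UAR, the mean of the per-class recalls) from a confusion matrix. The counts are invented for illustration and the function name `metrics_from_confusion` is hypothetical; none of it comes from the reviewed studies.

```python
def metrics_from_confusion(matrix):
    """Compute common metrics from a square confusion matrix.

    matrix[i][j] is the number of samples of true class i that were
    predicted as class j.
    """
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n_classes))
    recall, precision, specificity = [], [], []
    for c in range(n_classes):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp
        fp = sum(matrix[r][c] for r in range(n_classes)) - tp
        tn = total - tp - fn - fp
        recall.append(tp / (tp + fn) if tp + fn else 0.0)
        precision.append(tp / (tp + fp) if tp + fp else 0.0)
        specificity.append(tn / (tn + fp) if tn + fp else 0.0)
    return {
        "accuracy": correct / total,
        "recall": recall,
        "precision": precision,
        "specificity": specificity,
        "uar": sum(recall) / n_classes,
    }


# Invented, imbalanced two-class example: 95 "normal" and 5 "abnormal"
# heart-sound recordings; the classifier labels almost everything "normal".
confusion = [
    [94, 1],  # true normal:   94 predicted normal, 1 predicted abnormal
    [4, 1],   # true abnormal:  4 predicted normal, 1 predicted abnormal
]
result = metrics_from_confusion(confusion)
```

With these invented counts, accuracy is 0.95 while UAR is roughly 0.59, because only one of the five abnormal recordings is recovered. This is the kind of gap that makes accuracy alone a weak summary when classes are unbalanced.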
Finally, two types of acoustic signal classification schemes were identified: detection-and-classification, otherwise known as acoustic event detection (AED), and detection-by-classification, otherwise known as acoustic event classification (AEC). While the former involves detecting the sound and then classifying it, the latter detects sound by classifying the audio segments. In detection-and-classification, no classification decision is made at the detection stage; rather, segmentation is done when a segment boundary is detected based on a chosen threshold (Temko & Nadeu, 2009). Conversely, in detection-by-classification, the task of detection automatically translates to classification, as its strategy is based on using classifiers (such as HMM and logistic regression) with inbuilt segmentation algorithms (Temko & Nadeu, 2009). As shown in Figure 8, 71% of the studies focused on AEC, while 29% adopted the AED approach. Also, detection-and-classification was performed in the domains of bioacoustics and ecoacoustics only, while detection-by-classification cut across the three identified domains, with bioacoustics as the most explored.

Figure 8. Distribution of classification categories per application domain by year

LIMITATION OF THE STUDY AND CONCLUSION

This paper presented the findings of a systematic review of primary studies in the area of sound classification between the years 2010 and 2019. A major strength of this systematic review is that it was not specific to a particular sound; rather, it considered every kind of sound across the domains of bioacoustics, ecoacoustics, and biomedical acoustics. It also identified two broad categories of sound classification schemes: acoustic event detection (AED) and acoustic event classification (AEC).
Findings from the review indicated that automatic detection and classification systems were useful tools that could differentiate one acoustic event from another, especially when deep learning techniques were used for the task. Although the reviewed studies provided methodologies and algorithms used in various domains of sound classification, findings indicated that the methodologies and domains (in terms of scope) were not exhaustive. For instance, there was no study on the acoustic classification or detection of extreme events such as seismic and volcanic activities, or on the classification of medical conditions other than respiratory tract-related diseases. Also, the unavailability of public benchmark datasets for sound classification in certain domains posed a challenge to the reproducibility of research approaches. Another hindrance to reproducibility is that model architectures and training methods were not disclosed, especially in conference articles. Considering the relevance of reproducibility in scientific research, this research gap should be addressed in future studies. Generally, future studies should seek to address research challenges such as limited bandwidth, threshold problems, lack of general applicability of classifiers, and lack of publicly available datasets. Furthermore, this study acknowledges that its search strategy is not exhaustive: limiting the search to open-access articles in Scopus and ASA creates the possibility of omitting other relevant studies.

FUNDING AGENCY

The publisher has waived the Open Access Processing fee for this article.

REFERENCES

Allen, J. A., Murray, A., Noad, M. J., Dunlop, R. A., & Garland, E. C. (2017). Using self-organizing maps to classify humpback whale song units and quantify their similarity. The Journal of the Acoustical Society of America, 142(4), 1943–1952.
doi:10.1121/1.4982040 PMID:29092588
Amiriparian, S., Gerczuk, M., Ottl, S., Cummins, N., Freitag, M., Pugachevskiy, S., Baird, A., & Schuller, B. (2017). Snore sound classification using image-based deep spectrum features. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2017-3512–3516. doi:10.21437/Interspeech.2017-434
Aucouturier, J.-J., Nonaka, Y., Katahira, K., & Okanoya, K. (2011). Segmentation of expiratory and inspiratory sounds in baby cry audio recordings using hidden Markov models. The Journal of the Acoustical Society of America, 130(5), 2969–2977. doi:10.1121/1.3641377 PMID:22087925
Binder, C., & Paul, H. (2019). Range-dependent impacts of ocean acoustic propagation on automated classification of transmitted bowhead and humpback whale vocalizations. The Journal of the Acoustical Society of America, 145(4), 2480–2497. doi:10.1121/1.5097593 PMID:31046335
Boddapati, V., Petef, A., Rasmusson, J., & Lundberg, L. (2017). Classifying environmental sounds using image recognition networks. Procedia Computer Science, 112, 2048–2056. doi:10.1016/j.procs.2017.08.250
Bold, N., Zhang, C., & Akashi, T. (2019). Cross-domain deep feature combination for bird species classification with audio-visual data. IEICE Transactions on Information and Systems, E102D(10), 2033–2042. doi:10.1587/transinf.2018EDP7383
Bourouhou, A., Jilbab, A., Nacir, C., & Hammouch, A. (2019). Heart sounds classification for medical diagnostic assistance. International Journal of Online and Biomedical Engineering, 15(11), 88–103. doi:10.3991/ijoe.v15i11.10804
Briggs, F., Lakshminarayanan, B., Neal, L., Fern, X. Z., Raich, R., Hadley, S. J. K., Hadley, A. S., & Betts, M. G. (2012). Acoustic classification of multiple simultaneous bird species: A multi-instance multi-label approach. The Journal of the Acoustical Society of America, 131(6), 4640–4650.
doi:10.1121/1.4707424 PMID:22712937
Chen, H., Yuan, X., Pei, Z., Li, M., & Li, J. (2019). Triple-Classification of Respiratory Sounds Using Optimized S-Transform and Deep Residual Networks. IEEE Access: Practical Innovations, Open Solutions, 7(April), 32845–32852. doi:10.1109/ACCESS.2019.2903859
Chu, S., Narayanan, S., & Kuo, C. C. J. (2009). Environmental sound recognition with time-frequency audio features. IEEE Transactions on Audio, Speech, and Language Processing, 17(6), 1142–1158. doi:10.1109/TASL.2009.2017438
Colonna, J., Peet, T., Ferreira, C. A., Jorge, A. M., Gomes, E. F., & Gama, J. (2016). Automatic classification of anuran sounds using convolutional neural networks. ACM International Conference Proceeding Series, 73–78. doi:10.1145/2948992.2949016
Cvengros, R. M., Valente, D., Nykaza, E. T., & Vipperman, J. S. (2012). Blast noise classification with common sound level meter metrics. The Journal of the Acoustical Society of America, 132(2), 822–831. doi:10.1121/1.4730921 PMID:22894205
Davis, N., & Suresh, K. (2019). Environmental sound classification using deep convolutional neural networks and data augmentation. 2018 IEEE Recent Advances in Intelligent Computational Systems. RAICS, 2018, 41–45. doi:10.1109/RAICS.2018.8635051
Domingo, M. C. (2012). An overview of the internet of underwater things. Journal of Network and Computer Applications, 35(6), 1879–1890. doi:10.1016/j.jnca.2012.07.012
Dwivedi, A. K., Imtiaz, S. A., & Rodriguez-Villegas, E. (2019). Algorithms for automatic analysis and classification of heart sounds-A systematic review. IEEE Access: Practical Innovations, Open Solutions, 7(c), 8316–8345. doi:10.1109/ACCESS.2018.2889437
Elfergany, A. K., & Adl, A. (2020). Identification of Telecom Volatile Customers Using a Particle Swarm Optimized K-Means Clustering on Their Personality Traits Analysis. International Journal of Service Science, Management, Engineering, and Technology, 11(2), 1–15.
doi:10.4018/IJSSMET.2020040101
Fang, S. H., Te Wang, C., Chen, J. Y., Tsao, Y., & Lin, F. C. (2019). Combining acoustic signals and medical records to improve pathological voice classification. APSIPA Transactions on Signal and Information Processing, 8(1), 1–11. doi:10.1017/ATSIP.2019.7
Gingras, B., & Fitch, W. T. (2013). A three-parameter model for classifying anurans into four genera based on advertisement calls. The Journal of the Acoustical Society of America, 133(October), 547–559.
Giret, N., Roy, P., Albert, A., Pachet, F., Kreutzer, M., & Bovet, D. (2011). Finding good acoustic features for parrot vocalizations: The feature generation approach. The Journal of the Acoustical Society of America, 129(2), 1089–1099. doi:10.1121/1.3531953 PMID:21361465
Greenhalgh, T. (1997). How to read a paper: Papers that summarise other papers (systematic reviews and meta-analyses). BMJ (Clinical Research Ed.), 315(7109), 672–675. doi:10.1136/bmj.315.7109.672 PMID:9310574
Guilment, T., Socheleau, F.-X., Pastor, D., & Vallez, S. (2018). Sparse representation-based classification of mysticete calls. The Journal of the Acoustical Society of America, 144(3), 1550–1563. doi:10.1121/1.5055209 PMID:30424647
Halkias, X., Paris, S., & Glotin, H. (2013). Classification of mysticete sounds using machine learning techniques. The Journal of the Acoustical Society of America, 134(5), 3496–3505. doi:10.1121/1.4821203 PMID:24180760
Han, W., Coutinho, E., Ruan, H., Li, H., Schuller, B., Yu, X., & Zhu, X. (2016). Semi-supervised active learning for sound classification in hybrid learning environments. PLoS One, 11(9), 1–19. doi:10.1371/journal.pone.0162075 PMID:27627768
Hao, Y., Weiss, G. M., & Brown, S. M. (2018). Identification of Candidate Genes Responsible for Age-related Macular Degeneration using Microarray Data.
International Journal of Service Science, Management, Engineering, and Technology, 9(2), 33–60. doi:10.4018/IJSSMET.2018040102
Hlioui, F., Aloui, N., & Gargouri, F. (2020). Withdrawal Prediction Framework in Virtual Learning Environment. International Journal of Service Science, Management, Engineering, and Technology, 11(3), 47–64. doi:10.4018/IJSSMET.2020070104
Humayun, A. I., Tauhiduzzaman Khan, M., Ghaffarzadegan, S., Feng, Z., & Hasan, T. (2018). An ensemble of transfer, semi-supervised and supervised learning methods for pathological heart sound classification. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 127–131. doi:10.21437/Interspeech.2018-2413
Ibrahim, A. K., Chérubin, L. M., Zhuang, H., Schärer Umpierre, M. T., Dalgleish, F., Erdol, N., Ouyang, B., & Dalgleish, A. (2018). An approach for automatic classification of grouper vocalizations with passive acoustic monitoring. The Journal of the Acoustical Society of America, 143(2), 666–676. doi:10.1121/1.5022281 PMID:29495690
Ibrahim, A. K., Zhuang, H., Chérubin, L. M., Schärer-Umpierre, M. T., & Erdol, N. (2018). Automatic classification of grouper species by their sounds using deep neural networks. The Journal of the Acoustical Society of America, 144(3), EL196–EL202. doi:10.1121/1.5054911 PMID:30424627
Ibrahim, A. K., Zhuang, H., Chérubin, L. M., Umpierre, M. T. S., Ali, A. M., Richard, S., Sch, M. T., Ali, A. M., Nemeth, R. S., & Erdol, N. (2019). Classification of red hind grouper call types using random ensemble of stacked autoencoders. The Journal of the Acoustical Society of America, 146(4), 2155–2162. doi:10.1121/1.5126861 PMID:31671953
Kaewtip, K., Alwan, A., O’Reilly, C., & Taylor, C. E. (2016). A robust automatic birdsong phrase classification: A template-based approach. The Journal of the Acoustical Society of America, 140(5), 3691–3701. doi:10.1121/1.4966592 PMID:27908084
Karbasi, M., Ahadi, S. M., & Bahmanian, M. (2011).
Environmental sound classification using spectral dynamic features. ICICS 2011 - 8th International Conference on Information, Communications and Signal Processing, 2–7. doi:10.1109/ICICS.2011.6173513
Kumar, D., Carvalho, P., Antunes, M., Paiva, R. P., & Henriques, J. (2010). Heart murmur classification with feature selection. 2010 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC’10, June 2014, 4566–4569. doi:10.1109/IEMBS.2010.5625940
Kumar, D., Carvalho, P., Couceiro, R., Antunes, M., Paiva, R. P., & Henriques, J. (2010). Heart murmur classification using complexity signatures. Proceedings - International Conference on Pattern Recognition, 2564–2567. doi:10.1109/ICPR.2010.628
Lebien, J., & Ioup, J. (2018). Species-level classification of beaked whale echolocation signals detected in the northern Gulf of Mexico. The Journal of the Acoustical Society of America, 144(1), 387–396. doi:10.1121/1.5047435 PMID:30075691
Loey, M., ElSawy, A., & Afify, M. (2020). Deep Learning in Plant Diseases Detection for Agricultural Crops: A Survey. International Journal of Service Science, Management, Engineering, and Technology, 11(2), 41–58. doi:10.4018/IJSSMET.2020040103
Loey, M., Naman, M. R., & Zayed, H. H. (2020). A Survey on Blood Image Diseases Detection Using Deep Learning. International Journal of Service Science, Management, Engineering, and Technology, 11(3), 18–32. doi:10.4018/IJSSMET.2020070102
Luque, A., Romero-Lemos, J., Carrasco, A., & Gonzalez-Abril, L. (2018). Temporally-aware algorithms for the classification of anuran sounds. PeerJ, 6(e4732), 1–40. doi:10.7717/peerj.4732 PMID:29740517
Malfante, M., Mars, J. I., Dalla Mura, M., & Gervaise, C. (2018). Automatic fish sounds classification. The Journal of the Acoustical Society of America, 143(5), 2834–2846.
doi:10.1121/1.5036628 PMID:29857733
Medhat, F., Chesmore, D., & Robinson, J. (2020). Masked Conditional Neural Networks for sound classification. Applied Soft Computing, 90(608014), 1–13. doi:10.1016/j.asoc.2020.106073
Mitilineos, S. A., Potirakis, S. M., Tatlas, N. A., & Rangoussi, M. (2018). A two-level sound classification platform for environmental monitoring. Journal of Sensors, 2018(5828074), 1–13. doi:10.1155/2018/5828074
Mun, S., Shon, S., Kim, W., Han, D. K., & Ko, H. (2017). A novel discriminative feature extraction for acoustic scene classification using RNN based source separation. IEICE Transactions on Information and Systems, E100D(12), 3041–3044. doi:10.1587/transinf.2017EDL8132
Noda, J. J., Travieso, C. M., & Sánchez-Rodríguez, D. (2016). Automatic taxonomic classification of fish based on their acoustic signals. Applied Sciences (Switzerland), 6(12), 443. Advance online publication. doi:10.3390/app6120443
Nogueira, D. M., Ferreira, C. A., Gomes, E. F., & Jorge, A. M. (2019). Classifying Heart Sounds Using Images of Motifs, MFCC, and Temporal Features. Journal of Medical Systems, 43(6), 186–203. doi:10.1007/s10916-019-1286-5 PMID:31056720
Oikarinen, T., Srinivasan, K., Meisner, O., Hyman, J. B., Parmar, S., Fanucci-Kiss, A., Desimone, R., Landman, R., & Feng, G. (2019). Deep convolutional network for animal sound classification and source attribution using dual audio recordings. The Journal of the Acoustical Society of America, 145(2), 654–662. doi:10.1121/1.5087827 PMID:30823820
Oletic, D., Arsenali, B., & Bilas, V. (2012). Towards continuous wheeze detection body sensor node as a core of asthma monitoring system. Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, 83 LNICST, 165–172. doi:10.1007/978-3-642-29734-2_23
Ou, H., Au, W., Zurk, L., & Lammers, M. (2013). Automated extraction and classification of time-frequency contours in humpback vocalizations.
The Journal of the Acoustical Society of America, 133(1), 301–310. doi:10.1121/1.4770251 PMID:23297903
Oweis, R. J., Abdulhay, E. W., Khayal, A., & Awad, A. (2015). An alternative respiratory sounds classification system utilizing artificial neural networks. Biomedical Journal, 38(2), 153–161. doi:10.4103/2319-4170.137773 PMID:25179722
Palaniappan, R., Sundaraj, K., & Ahamed, N. U. (2013). Machine learning in lung sound analysis: A systematic review. Biocybernetics and Biomedical Engineering, 33(3), 129–135. doi:10.1016/j.bbe.2013.07.001
Pandeya, Y. R., Kim, D., & Lee, J. (2018). Domestic cat sound classification using learned features from deep neural nets. Applied Sciences (Switzerland), 8(10), 1–17. doi:10.3390/app8101949
Pandeya, Y. R., & Lee, J. (2018). Domestic cat sound classification using transfer learning. International Journal of Fuzzy Logic and Intelligent Systems, 18(2), 154–160. doi:10.5391/IJFIS.2018.18.2.154
Parada, P. P., & Cardenal-Lopez, A. (2014). Using Gaussian mixture models to detect and classify dolphin whistles and pulses. The Journal of the Acoustical Society of America, 135(6), 3371–3381. doi:10.1121/1.4876439 PMID:24907800
Perr, J. (2005). Basic acoustics and Signal Processing. LinuxFocus.Org, 1(271), 1–22. http://linuxfocus.org
Pramono, R. X. A., Bowyer, S., & Rodriguez-Villegas, E. (2017). Automatic adventitious respiratory sound analysis: A systematic review. PLoS One, 12(5), e0177926. Advance online publication. doi:10.1371/journal.pone.0177926 PMID:28552969
Qian, K., Zhang, Z., Baird, A., & Schuller, B. (2017). Active learning for bird sound classification via a kernel-based extreme learning machine. The Journal of the Acoustical Society of America, 142(4), 1796–1804. doi:10.1121/1.5004570 PMID:29092546
Roch, M. A., Klinck, H., Baumann-Pickering, S., Mellinger, D. K., Qui, S., Soldevilla, M. S., & Hildebrand, J. A. (2011).
Classification of echolocation clicks from odontocetes in the Southern California Bight. The Journal of the Acoustical Society of America, 129(1), 467–475. doi:10.1121/1.3514383 PMID:21303026 Salama, M. A., & Hassanien, A. E. (2014). Fuzzification of Euclidean Space Approach in Machine Learning Techniques. International Journal of Service Science, Management, Engineering, and Technology, 5(4), 29–43. doi:10.4018/ijssmet.2014100103 Salamon, J., & Bello, J. P. (2015). Unsupervised Feature Learning for Urban Sound Classification. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 171–175. Salamon, J., & Bello, J. P. (2017). Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification. IEEE Signal Processing Letters, 24(3), 279–283. doi:10.1109/LSP.2017.2657381 Sangwan, N., & Bhatnagar, V. (2020). Comprehensive Contemplation of Probabilistic Aspects in Intelligent Analytics. International Journal of Service Science, Management, Engineering, and Technology, 11(1), 116–141. doi:10.4018/IJSSMET.2020010108 Sengupta, N., Sahidullah, M., & Saha, G. (2016). Lung sound classification using cepstral-based statistical features. Computers in Biology and Medicine, 75, 118–129. doi:10.1016/j.compbiomed.2016.05.013 PMID:27286184 Shamir, L., Yerby, C., Simpson, R., von Benda-Beckmann, A. M., Tyack, P., Samarra, F., Miller, P., & Wallin, J. (2014). Classification of large acoustic datasets using machine learning and crowdsourcing: Application to whale calls. The Journal of the Acoustical Society of America, 135(2), 953–962. doi:10.1121/1.4861348 PMID:25234903 Su, Y., Zhang, K., Wang, J., & Madani, K. (2019). Environment sound classification using a two-stream CNN based on decision-level fusion. Sensors (Switzerland), 19(7), 1–15. doi:10.3390/s19071733 PMID:30978974 Tan, L. N., Alwan, A., Kossan, G., Cody, M. L., & Taylor, C. E. (2015). 
Dynamic time warping and sparse representation classification for birdsong phrase classification using limited training data a). The Journal of the Acoustical Society of America, 137(3), 1069–1080. Advance online publication. doi:10.1121/1.4906168 PMID:25786922 Tatoian, R., & Hamel, L. (2018). Self-organizing map convergence. International Journal of Service Science, Management, Engineering, and Technology, 9(2), 61–84. doi:10.4018/IJSSMET.2018040103 Temko, A., Nadeu, C., Macho, D., Malkin, R., Zieger, C., & Omologo, M. (2009). Acoustic Event Detection and Classification. Computers in the Human Interaction Loop, (December), 61–73. doi:10.1007/978-1-84882-054-8_7 Thakur, A., Thapar, D., Rajan, P., & Nigam, A. (2019). Deep metric learning for bioacoustic classification: Overcoming training data scarcity using dynamic triplet loss. The Journal of the Acoustical Society of America, 146(1), 534–547. doi:10.1121/1.5118245 PMID:31370640 Tschannen, M., Kramer, T., Marti, G., Heinzmann, M., & Wiatowski, T. (2016). Heart sound classification using deep structured features. Computers in Cardiology, 43, 565–568. doi:10.22489/CinC.2016.162-186 Vahabi, N., & Selviah, D. R. (2019). Convolutional Neural Networks to Classify Oil, Wat,er and Gas Wells Fluid Using Acoustic Signals. 2019 IEEE 19th International Symposium on Signal Processing and Information Technology, ISSPIT 2019. doi:10.1109/ISSPIT47144.2019.9001845 22 International Journal of Service Science, Management, Engineering, and Technology Volume 13 • Issue 1 Vrbancic, G., & Podgorelec, V. (2018). Automatic classification of motor impairment neural disorders from EEG signals using deep convolutional neural networks. Elektronika ir Elektrotechnika, 24(4), 1–7. doi:10.5755/ j01.eie.24.4.21469 Wren, Y., Harding, S., Goldbart, J., & Roulstone, S. (2018). A systematic review and classification of interventions for speech-sound disorder in preschool children. 
International Journal of Language & Communication Disorders, 53(3), 446–467. doi:10.1111/1460-6984.12371 PMID:29341346 Yaseen, S., Son, G.-Y., & Kwon, S. (2018). Classification of heart sound signal using multiple features. Applied Sciences (Basel, Switzerland), 8(12), 1–14. doi:10.3390/app8122344 Zhang, Y., Lv, D., & Zhao, Y. (2016). Multiple-view active learning for environmental sound classification. International Journal of Online Engineering, 12(12), 49–54. doi:10.3991/ijoe.v12i12.6458 Zhang, Y.-J., Huang, J.-F., Gong, N., Ling, Z.-H., & Hu, Y. (2018). Automatic detection and classification of marmoset vocalizations using deep and recurrent neural networks. The Journal of the Acoustical Society of America, 144(1), 478–487. doi:10.1121/1.5047743 PMID:30075670 Zhao, H., Huang, X., Liu, W., & Yang, L. (2018). Environmental sound classification based on feature fusion. MATEC Web of Conferences, 173, 1–5. doi:10.1051/matecconf/201817303059 Zhu, B., Wang, C., Liu, F., Lei, J., Lu, Z., & Peng, Y. (2018). Learning Environmental Sounds with with Multi- scale Convolutional Neural Network. Proceedings of the International Joint Conference on Neural Networks (IJCNN), 1–8. doi:10.1109/IJCNN.2018.8489641 23 International Journal of Service Science, Management, Engineering, and Technology Volume 13 • Issue 1 APPENdIX A - PRIMARy STUdIES USEd FoR THE SySTEMATIC REVIEw Table 4. Primary studies Ref no. Bibliography      A1. Shamir, L., Yerby, C., Simpson, R., von Benda-Beckmann, A. M., Tyack, P., Samarra, F., Miller, P., & Wallin, J. (2014). Classification of large acoustic datasets using machine learning and crowdsourcing: Application to whale calls. The Journal of the Acoustical Society of America, 135(2), 953–962. https://doi.org/10.1121/1.4861348      A2. Qian, K., Zhang, Z., Baird, A., & Schuller, B. (2017). Active learning for bird sound classification via a kernel-based extreme learning machine. The Journal of the Acoustical Society of America, 142(4), 1796–1804. 
https://doi.org/10.1121/1.5004570
A3. Malfante, M., Mars, J. I., Dalla Mura, M., & Gervaise, C. (2018). Automatic fish sounds classification. The Journal of the Acoustical Society of America, 143(5), 2834–2846. https://doi.org/10.1121/1.5036628
A4. Halkias, X. C., Paris, S., & Glotin, H. (2013). Classification of mysticete sounds using machine learning techniques. The Journal of the Acoustical Society of America, 134(5), 3496–3505. https://doi.org/10.1121/1.4821203
A5. Thakur, A., Thapar, D., Rajan, P., & Nigam, A. (2019). Deep metric learning for bioacoustic classification: Overcoming training data scarcity using dynamic triplet loss. The Journal of the Acoustical Society of America, 146(1), 534–547. https://doi.org/10.1121/1.5118245
A6. Cvengros, R. M., Valente, D., Nykaza, E. T., & Vipperman, J. S. (2012a). Blast noise classification with common sound level meter metrics. The Journal of the Acoustical Society of America, 132(2), 822–831. https://doi.org/10.1121/1.4730921
A7. Briggs, F., Lakshminarayanan, B., Neal, L., Fern, X. Z., Raich, R., Hadley, S. J. K., Hadley, A. S., & Betts, M. G. (2012). Acoustic classification of multiple simultaneous bird species: A multi-instance multi-label approach. The Journal of the Acoustical Society of America, 131(6), 4640–4650. https://doi.org/10.1121/1.4707424
A8. Robakis, E., Watsa, M., & Erkenswick, G. (2018). Classification of producer characteristics in primate long calls using neural networks. The Journal of the Acoustical Society of America, 144(1), 344–353. https://doi.org/10.1121/1.5046526
A9. Ibrahim, A. K., Chérubin, L. M., Zhuang, H., Schärer Umpierre, M. T., Dalgleish, F., Erdol, N., Ouyang, B., & Dalgleish, A. (2018). An approach for automatic classification of grouper vocalizations with passive acoustic monitoring. The Journal of the Acoustical Society of America, 143(2), 666–676. https://doi.org/10.1121/1.5022281
A10. Zhang, Y.-J., Huang, J.-F., Gong, N., Ling, Z.-H., & Hu, Y. (2018). Automatic detection and classification of marmoset vocalizations using deep and recurrent neural networks. The Journal of the Acoustical Society of America, 144(1), 478–487. https://doi.org/10.1121/1.5047743
A11. Oikarinen, T., Srinivasan, K., Meisner, O., Hyman, J. B., Parmar, S., Fanucci-Kiss, A., Desimone, R., Landman, R., & Feng, G. (2019). Deep convolutional network for animal sound classification and source attribution using dual audio recordings. The Journal of the Acoustical Society of America, 145(2), 654–662. https://doi.org/10.1121/1.5087827
A12. Ibrahim, A. K., Zhuang, H., Chérubin, L. M., Umpierre, M. T. S., Ali, A. M., Richard, S., Sch, M. T., Ali, A. M., Nemeth, R. S., & Erdol, N. (2019). Classification of red hind grouper call types using a random ensemble of stacked autoencoders. 2155. https://doi.org/10.1121/1.5126861
A13. Guilment, T., Socheleau, F.-X., Pastor, D., & Vallez, S. (2018). Sparse representation-based classification of mysticete calls. The Journal of the Acoustical Society of America, 144(3), 1550–1563. https://doi.org/10.1121/1.5055209
A14. Kaewtip, K., Alwan, A., O’Reilly, C., & Taylor, C. E. (2016). A robust automatic birdsong phrase classification: A template-based approach. The Journal of the Acoustical Society of America, 140(5), 3691–3701. https://doi.org/10.1121/1.4966592
A15. Binder, C., & Paul, H. (2019). Range-dependent impacts of ocean acoustic propagation on automated classification of transmitted bowhead and humpback whale vocalizations. 2480. https://doi.org/10.1121/1.5097593
A16. Roch, M. A., Klinck, H., Baumann-Pickering, S., Mellinger, D. K., Qui, S., Soldevilla, M. S., & Hildebrand, J. A. (2011). Classification of echolocation clicks from odontocetes in the Southern California Bight. The Journal of the Acoustical Society of America, 129(1), 467–475. https://doi.org/10.1121/1.3514383
A17. Allen, J. A., Murray, A., Noad, M. J., Dunlop, R. A., & Garland, E. C. (2017). Using self-organizing maps to classify humpback whale song units and quantify their similarity. The Journal of the Acoustical Society of America, 142(4), 1943–1952. https://doi.org/10.1121/1.4982040
A18. Tan, L. N., Alwan, A., Kossan, G., Cody, M. L., & Taylor, C. E. (2015). Dynamic time warping and sparse representation classification for birdsong phrase classification using limited training data. The Journal of the Acoustical Society of America, 137(3), 1069–1080. https://doi.org/10.1121/1.4906168
A19. Ou, H., Au, W., Zurk, L., & Lammers, M. (2013). Automated extraction and classification of time-frequency contours in humpback vocalizations. The Journal of the Acoustical Society of America, 133(1), 301–310. https://doi.org/10.1121/1.4770251
A20. LeBien, J. G., & Ioup, J. W. (2018). Species-level classification of beaked whale echolocation signals detected in the northern Gulf of Mexico. The Journal of the Acoustical Society of America, 144(1), 387–396. https://doi.org/10.1121/1.5047435
A21. Giret, N., Roy, P., Albert, A., Pachet, F., Kreutzer, M., & Bovet, D. (2011). Finding good acoustic features for parrot vocalizations: The feature generation approach. The Journal of the Acoustical Society of America, 129(2), 1089–1099.
A22. Parada, P. P., & Cardenal-Lopez, A. (2014). Using Gaussian mixture models to detect and classify dolphin whistles and pulses. The Journal of the Acoustical Society of America, 135(6), 3371–3381. https://doi.org/10.1121/1.4876439
A23. Gingras, B., & Fitch, W. T. (2013). A three-parameter model for classifying anurans into four genera based on advertisement calls. 133(October 2012), 547–559.
A24. Aucouturier, J.-J., Nonaka, Y., Katahira, K., & Okanoya, K. (2011). Segmentation of expiratory and inspiratory sounds in baby cry audio recordings using hidden Markov models. The Journal of the Acoustical Society of America, 130(5), 2969–2977. https://doi.org/10.1121/1.3641377
A25. Bishop, J. C., Falzon, G., Trotter, M., Kwan, P., & Meek, P. D. (2019). Livestock vocalization classification in farm soundscapes. Computers and Electronics in Agriculture, 162(April), 531–542. https://doi.org/10.1016/j.compag.2019.04.020
A26. Aziz, S., Awais, M., Akram, T., Khan, U., Alhussein, M., & Aurangzeb, K. (2019). Automatic scene recognition through acoustic classification for behavioral robotics. Electronics (Switzerland), 8(5). https://doi.org/10.3390/electronics8050483
A27. Chen, H., Yuan, X., Pei, Z., Li, M., & Li, J. (2019). Triple-Classification of Respiratory Sounds Using Optimized S-Transform and Deep Residual Networks. IEEE Access, 7(April), 32845–32852. https://doi.org/10.1109/ACCESS.2019.2903859
A28. Bourouhou, A., Jilbab, A., Nacir, C., & Hammouch, A. (2019). Heart sounds classification for medical diagnostic assistance. International Journal of Online and Biomedical Engineering, 15(11), 88–103. https://doi.org/10.3991/ijoe.v15i11.10804
A29. Yaseen, Son, G. Y., & Kwon, S. (2018). Classification of heart sound signal using multiple features. Applied Sciences (Switzerland), 8(12). https://doi.org/10.3390/app8122344
A30. Pandeya, Y. R., Kim, D., & Lee, J. (2018). Domestic cat sound classification using learned features from deep neural nets. Applied Sciences (Switzerland), 8(10), 1–17. https://doi.org/10.3390/app8101949
A31. Luque, A., Romero-Lemos, J., Carrasco, A., & Barbancho, J. (2018). Non-sequential automatic classification of anuran sounds for the estimation of climate-change indicators. Expert Systems with Applications, 95, 248–260. https://doi.org/10.1016/j.eswa.2017.11.016
A32. Kim, Y., Sa, J., Chung, Y., Park, D., & Lee, S. (2018). Resource-efficient pet dog sound events classification using LSTM-FCN based on time-series data. Sensors (Switzerland), 18(11). https://doi.org/10.3390/s18114019
A33. Luque, A., Romero-Lemos, J., Carrasco, A., & Gonzalez-Abril, L. (2018). Temporally aware algorithms for the classification of anuran sounds. PeerJ, 2018(5), 1–40. https://doi.org/10.7717/peerj.4732
A34. Aykanat, M., Kılıç, Ö., Kurt, B., & Saryal, S. (2017). Classification of lung sounds using convolutional neural networks. Eurasip Journal on Image and Video Processing, 2017(1). https://doi.org/10.1186/s13640-017-0213-2
A35. Zhang, Y., Lv, D., & Zhao, Y. (2016). Multiple-view active learning for environmental sound classification. International Journal of Online Engineering, 12(12), 49–54. https://doi.org/10.3991/ijoe.v12i12.6458
A36. Han, W., Coutinho, E., Ruan, H., Li, H., Schuller, B., Yu, X., & Zhu, X. (2016). Semi-supervised active learning for sound classification in hybrid learning environments. PLoS ONE, 11(9), 1–19. https://doi.org/10.1371/journal.pone.0162075
A37. Noda, J. J., Travieso, C. M., & Sánchez-Rodríguez, D. (2016). Automatic taxonomic classification of fish based on their acoustic signals. Applied Sciences (Switzerland), 6(12). https://doi.org/10.3390/app6120443
A38. Raza, A., Mehmood, A., Ullah, S., Ahmad, M., Choi, G. S., & On, B. W. (2019). Heartbeat sound signal classification using deep learning. Sensors (Switzerland), 19(21), 1–15. https://doi.org/10.3390/s19214819
A39. Su, Y., Zhang, K., Wang, J., & Madani, K. (2019). Environment sound classification using a two-stream CNN based on decision-level fusion. Sensors (Switzerland), 19(7), 1–15. https://doi.org/10.3390/s19071733
A40. Khamparia, A., Gupta, D., Nguyen, N. G., Khanna, A., Pandey, B., & Tiwari, P. (2019). Sound classification using convolutional neural network and tensor deep stacking network. IEEE Access, 7(January), 7717–7727. https://doi.org/10.1109/ACCESS.2018.2888882
A41. Bold, N., Zhang, C., & Akashi, T. (2019). Cross-domain deep feature combination for bird species classification with audio-visual data. IEICE Transactions on Information and Systems, E102D(10), 2033–2042. https://doi.org/10.1587/transinf.2018EDP7383
A42. Verma, D., Jana, A., & Ramamritham, K. (2019). Classification and mapping of sound sources in local urban streets through AudioSet data and Bayesian optimized Neural Networks. Noise Mapping, 6(1), 52–71. https://doi.org/10.1515/noise-2019-0005
A43. Wu, J., Chua, Y., Zhang, M., Li, H., & Tan, K. C. (2018). A spiking neural network framework for robust sound classification. Frontiers in Neuroscience, 12(NOV), 1–17. https://doi.org/10.3389/fnins.2018.00836
A44. Pandeya, Y. R., & Lee, J. (2018). Domestic cat sound classification using transfer learning. International Journal of Fuzzy Logic and Intelligent Systems, 18(2), 154–160. https://doi.org/10.5391/IJFIS.2018.18.2.154
A45. Vrbancic, G., & Podgorelec, V. (2018). Automatic classification of motor impairment neural disorders from EEG signals using deep convolutional neural networks. Elektronika Ir Elektrotechnika, 24(4), 1–7. https://doi.org/10.5755/j01.eie.24.4.21469
A46. Salamon, J., & Bello, J. P. (2017). Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification. IEEE Signal Processing Letters, 24(3), 279–283. https://doi.org/10.1109/LSP.2017.2657381
A47. Oweis, R. J., Abdulhay, E. W., Khayal, A., & Awad, A. (2015). An alternative respiratory sounds classification system utilizing artificial neural networks. Biomedical Journal, 38(2), 153–161. https://doi.org/10.4103/2319-4170.137773
A48. Fang, S. H., Wang, C. Te, Chen, J. Y., Tsao, Y., & Lin, F. C. (2019). Combining acoustic signals and medical records to improve pathological voice classification. APSIPA Transactions on Signal and Information Processing, 8(2019), 1–11. https://doi.org/10.1017/ATSIP.2019.7
A49. Wang, W., Meratnia, N., Seraj, F., & Havinga, P. J. M. (2019). Privacy-aware environmental sound classification for indoor human activity recognition. ACM International Conference Proceeding Series, 36–44. https://doi.org/10.1145/3316782.3321521
A50. Kroos, C., Bones, O., Cao, Y., Harris, L., Jackson, P. J. B., Davies, W. J., Wang, W., Cox, T. J., & Plumbley, M. D. (2019). Generalization in Environmental Sound Classification: The “Making Sense of Sounds” Data Set and Challenge. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 2019-May, 8082–8086. https://doi.org/10.1109/ICASSP.2019.8683292
A51. Humayun, A. I., Tauhiduzzaman Khan, M., Ghaffarzadegan, S., Feng, Z., & Hasan, T. (2018). An ensemble of transfer, semi-supervised and supervised learning methods for pathological heart sound classification. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2018-September(i), 127–131. https://doi.org/10.21437/Interspeech.2018-2413
A52. Colonna, J., Peet, T., Ferreira, C. A., Jorge, A. M., Gomes, E. F., & Gama, J. (2016). Automatic classification of anuran sounds using convolutional neural networks. ACM International Conference Proceeding Series, 20-22-July-2016, 73–78. https://doi.org/10.1145/2948992.2949016
A53. Tschannen, M., Kramer, T., Marti, G., Heinzmann, M., & Wiatowski, T. (2016). Heart sound classification using deep structured features. Computing in Cardiology, 43, 565–568. https://doi.org/10.22489/cinc.2016.162-186
A54. Yang, X., Yang, F., Gobeawan, L., Yeo, S. Y., Leng, S., Zhong, L., & Su, Y. (2016). A multi-modal classifier for heart sound recordings. Computing in Cardiology, 43, 1165–1168. https://doi.org/10.22489/cinc.2016.339-225
A55. Salamon, J., & Bello, J. P. (2015). Unsupervised Feature Learning for Urban Sound Classification. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 171–175.
A56. Kocuvan, P., & Torkar, D. (2015). Classification of the heart auscultation signals. HEALTHINF 2015 - 8th International Conference on Health Informatics, Proceedings; Part of 8th International Joint Conference on Biomedical Engineering Systems and Technologies, BIOSTEC 2015, 534–539. https://doi.org/10.5220/0005264005340539
A57. Silva, P. (2012). Classification, segmentation, and chronological prediction of cinematic sound. Proceedings - 2012 11th International Conference on Machine Learning and Applications, ICMLA 2012, 2, 369–374. https://doi.org/10.1109/ICMLA.2012.172
A58. Kumar, D., Carvalho, P., Couceiro, R., Antunes, M., Paiva, R. P., & Henriques, J. (2010). Heart murmur classification using complexity signatures. Proceedings - International Conference on Pattern Recognition, 2564–2567. https://doi.org/10.1109/ICPR.2010.628
A59. Zhu, B., Wang, C., Liu, F., Lei, J., Lu, Z., & Peng, Y. (2018). Learning Environmental Sounds with Multi-scale Convolutional Neural Network. Proceedings of the International Joint Conference on Neural Networks (IJCNN), 1–8. https://doi.org/10.1109/IJCNN.2018.8489641
A60. Zhao, H., Huang, X., Liu, W., & Yang, L. (2018). Environmental sound classification based on feature fusion. MATEC Web of Conferences, 173, 1–5. https://doi.org/10.1051/matecconf/201817303059
A61. Hu, W., Lv, J., Liu, D., & Chen, Y. (2018). Unsupervised Feature Learning for Heart Sounds Classification Using Autoencoder. Journal of Physics: Conference Series, 1004(1). https://doi.org/10.1088/1742-6596/1004/1/012002
A62. Bisot, V., Serizel, R., Essid, S., & Richard, G. (2017). Leveraging deep neural networks with nonnegative representations for improved environmental sound classification. IEEE International Workshop on Machine Learning for Signal Processing, MLSP, 2017-September, 1–6. https://doi.org/10.1109/MLSP.2017.8168139
A63. Amiriparian, S., Gerczuk, M., Ottl, S., Cummins, N., Freitag, M., Pugachevskiy, S., Baird, A., & Schuller, B. (2017). Snore sound classification using image-based deep spectrum features. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2017-August, 3512–3516. https://doi.org/10.21437/Interspeech.2017-434
A64. Boddapati, V., Petef, A., Rasmusson, J., & Lundberg, L. (2017). Classifying environmental sounds using image recognition networks. Procedia Computer Science, 112, 2048–2056. https://doi.org/10.1016/j.procs.2017.08.250
A65. Medhat, F., Chesmore, D., & Robinson, J. (2020). Masked Conditional Neural Networks for sound classification. Applied Soft Computing Journal, 90(608014), 1–13. https://doi.org/10.1016/j.asoc.2020.106073
A66. Nogueira, D. M., Ferreira, C. A., Gomes, E. F., & Jorge, A. M. (2019). Classifying Heart Sounds Using Images of Motifs, MFCC, and Temporal Features. Journal of Medical Systems, 43(6), 186–203. https://doi.org/10.1007/s10916-019-1286-5
A67. Kumar, D., Carvalho, P., Antunes, M., Paiva, R. P., & Henriques, J. (2010). Heart murmur classification with feature selection. 2010 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC’10, June 2014, 4566–4569. https://doi.org/10.1109/IEMBS.2010.5625940
A68. Vahabi, N., & Selviah, D. R. (2019).
Convolutional Neural Networks to Classify Oil, Water, and Gas Wells Fluid Using Acoustic Signals. 2019 IEEE 19th International Symposium on Signal Processing and Information Technology, ISSPIT 2019. https://doi.org/10.1109/ISSPIT47144.2019.9001845

Akon O. Ekpezu is a Lecturer in the Department of Computer Science, Cross River University of Technology (CRUTECH), Nigeria. She holds a Bachelor of Science (B.Sc.) in Mathematics and Statistics from the University of Calabar, Nigeria; a Post Graduate Diploma (PGD) in Computer Science from the same university; a Master of Science (M.Sc.) in Information Technology from the National Open University of Nigeria (NOUN); and a Master of Philosophy (MPhil) in Computer Science from the University of Ghana. She is currently pursuing a PhD in Information Processing Science at the University of Oulu, Finland. Her research interests include Persuasive Systems, Behavior Change Support Systems, Machine Learning, and Information Security.

Winfred Yaokumah is a researcher, cyber security expert, and senior faculty member at the Department of Computer Science of the University of Ghana. His work appears in several reputable journals including Information and Computer Security, International Journal of Distributed Artificial Intelligence, Journal of Information Technology Research, Information Resources Management Journal, IEEE Xplore, International Journal of e-Business Research, and International Journal of Enterprise Information Systems. He is an editor of Modern Theories and Practices for Cyber Ethics and Security Compliance. His research interests include Cyber Security, Machine Learning, Network Security, and Information Systems Security. He also serves on the International Review Board for the International Journal of Technology Diffusion.