Implementation of Artificial Intelligence for Classification of Frogs in Bioacoustics

Abstract: This research presents the implementation of artificial intelligence (AI) for the classification of frogs based on the symmetry of the bioacoustic spectrum, using the feedforward neural network approach (FNNA) and the support vector machine (SVM). Recently, the symmetry concept has been applied in physics and mathematics to make mathematical models tractable and achieve the best learning performance. Owing to the symmetry of the bioacoustic spectrum, feature extraction can be achieved by integrating the Mel-scale frequency cepstral coefficient (MFCC) technique with machine learning algorithms such as the SVM and neural networks.


Introduction
Life on Earth is closely tied to environmental changes on various spatial and temporal scales [1]. The success of human societies relies intimately on the living elements of natural and managed systems. Even though the geographical range limits of species naturally shift over time, climate change is gradually driving a universal redistribution of life on Earth [2,3].
To learn more about the distribution and migration of animals, prospective detection and research in bioacoustics should be conducted [4,5].
Animals have obvious characteristics that can serve as sensors for the early detection of the phenomena we care most about. Behavior [6,7] and sound [8][9][10] are two such characteristics. Sound can be used to detect many phenomena, no matter which acoustic field is involved [11]. If things produce sounds, there are existing ways to interpret and predict the possible meanings conveyed by those sounds. For example, things in good condition sound energetic, whereas they sound noticeably weak under bad conditions. As a specific example, a choking sound in the emissions of an automobile, or abnormal acoustic/vibration energy, might indicate a problem [12]. By the same token, different sound features may signal corresponding characteristics of an entity. This also explains why sounds are commonly used by animals to communicate and interact [8,13]. Practical experience shows that distinct animal acoustic features reveal different kinds of meaningful information [14]. Furthermore, the same information conveyed by a particular sound can be transformed into other forms without loss of meaning; for example, even under water, the sounds generated by some animals are converted into a series of bubbles [15,16]. Owing to the extensive acoustic communication among animals, automated acoustic monitoring not only offers an appropriate way to survey different species in their natural way of life, but also provides a convenient and cost-effective way to monitor target species efficiently, reducing the need for manual monitoring [17,18]. Nowadays, bioacoustic feature classification has become a very useful tool for experts to collect and process data into information for monitoring ecological systems [19]. The goal is to look for clues for understanding the bioacoustic features of an animal from the perspective of its bioacoustic mechanisms.
Remotely monitoring living creatures in real time allows us to gather significant acoustic data, which can be used not only to predict environmental changes but also to observe related phenomena such as global warming, animal extinction, natural disasters, and various diseases [20,21]. Research in this field has been mushrooming. According to information released by the Biodiversity Research Center, Academia Sinica, a study called "Deeply listening to diverse creature under extreme climate" is now being conducted to detect acoustic signals and collect data for building a training dataset. The related equipment is shown in Figure 1.
Because animals living in nature are quite sensitive to their environment, it is natural for them to react quickly to it [22]. This suggests the possibility of predicting coming natural phenomena by observing bioacoustic changes in animals [23,24]. Take frogs as an example: before it rains, frogs croak much louder. Experts have also gradually begun to focus on the automated identification of animal sounds [25] because animal sounds are relatively easy to recognize [14]. We therefore hope that bioacoustic changes can help us predict the dynamics of natural phenomena.
Recently, the world has entered the era of AI. Big data, the basis of AI, has become a critical tool for establishing an optimal decision model, which generates samples from the big data for machines to predict the best solution. Datasets are used to train the machine so that the model becomes familiar with the trend; training keeps correcting the error value toward its minimum until the model is optimized. Machines play an important role in every industrial process of the modern digital world, and various methods have been applied to assure machine operation by collecting and analyzing many different types of useful information, such as vibration, acoustic, and temperature trends [26]. In our experiment, we choose fifteen frog calls to be analyzed by our model [27]. Frog calls are easy to hear and collect, and are often used to provide the different classifications that are intended [28]. The reasons are threefold. First, the spectrogram structure of frog signals is relatively simple and can take the form of frequency tracks or harmonic waves, making spectral features easy to recognize even in a complicated environment. Second, the basic vocal unit of frog calls, the syllable, is often short and includes different durations in the classification [29]. Finally, automatic exploration and collection of acoustic data are highly convenient and effective, which has given rise to much research [30][31][32].
Machine learning is known as the brain of AI, meaning that it can deduce a model from a large amount of data; it is capable of analyzing and learning from known data and then making predictions on testing data [33,34]. It also provides researchers in classical statistics with extra data modelling techniques [6]. Such an automatic learning process has an assured place in the modern world. When the output space has no structure except whether two components of the output are equal or not, the problem is called classification learning. Each component of the output space is called a class, and a learning algorithm solving the classification problem is called a classifier. The task of such classification problems is to assign new inputs to one of a few discrete classes or categories; this problem characterizes most pattern identification tasks [35].
In this paper, we describe how computational machine learning methods can contribute to problem-solving in the bioacoustic field. Since some parts of the spectrum of the collected signals may not be robust enough, a pre-emphasis process is used to improve signal quality: the signals are passed through a high-pass filter to enhance the high-frequency magnitude features. This also filters out signals from the outside environment that are considered useless while keeping the rest intact. Then a well-known speech algorithm, the Mel-frequency cepstral coefficients (MFCC), widely used in speech technologies [36,37], transforms the original frog-call signals into a spectrum [38]. When the useful spectrum signals are gathered into a large dataset, the dataset serves to train the model. If the model is stably robust, it can then be applied to identify and classify new datasets [39]. Accordingly, we provide unknown datasets to two processors, a central processing unit (CPU) and a graphics processing unit (GPU), to perform sound identification of the frog species and further classify the characteristics of each frog. The pros and cons of the two kinds of processor will be discussed.
One of the most frequently applied algorithms is the neural network [40]. Neural networks are able to learn patterns or relationships from large training datasets, generalize what they have learned, and then extract the expected results from new data [41,42]. The approximate computation of neural networks is based on connectionism. After the model training process, the network is a machine that approximately maps inputs to desired outputs [43]. In our classification experiments, the feedforward neural network approach (FNNA) is adopted to conduct machine learning [44,45]. A feedforward neural network is a set of connected neurons in which information flows only in the forward direction, from inputs to outputs [46]. The other algorithm adopted in the experiment is the support vector machine (SVM). In machine learning, support vector machines belong to the supervised learning models, with associated learning algorithms that analyze data and identify patterns, and they are used for regression and classification analysis [47,48]. Additionally, MATLAB is applied in the experiment, and some notable aspects such as time duration, algorithms, and efficiency will be discussed.
Relevant to this study, numerous researchers have investigated the information content of vocal tone. For example, the connection between body mass and the frequency used in vocal performances was reported by Wallschlager (1980). Similar results were discussed by Boeckle et al. (2009) for 76 species of frogs, where body size accounted for 25% of the variation in dominant frequency [49,50].
Once we are able to analyze bioacoustic features for various purposes, we can further apply the technique to the analysis of other kinds of sound. Sound analysis can be further developed to deal with issues such as voice disorders and the detection of diseases of the heart or the lungs [51][52][53]. We therefore hope that this kind of bioacoustic research can be applied practically in the biomedical field to save lives. For example, by collecting information on a person's breath, we can monitor breath features and identify breathing movements [9]. This will help provide more effective medical therapy and improve the quality of human health care. We expect that the results of this research can lay a strong foundation for these applications.

Bioacoustic Feature Extraction
In the beginning, the analysis starts with an algorithm called pre-emphasis filtering, defined as $y[n] = x[n] - a \cdot x[n-1]$, where $a$ is the pre-emphasis coefficient. The primary reason for using this filter is that, as the vocal cord vibrates, the vocal-cord side can be regarded as a series of pulse signals passing through a glottal shaping filter. The output airflow velocity waveform has a characteristic high-frequency attenuation of −12 dB/oct. The lip side is seen as a radiation impedance, and the signal undergoes a high-pass filtering effect with a high-frequency enhancement of 6 dB/oct. For a discrete-time speech signal, we use a fixed-length window to observe and analyze the signals inside the window, from which speech features can be identified. If we choose a rectangular window function, the original signals are retained inside the window but set to zero outside it, making the signals at both edges discontinuous, as if they were cut off. This discontinuity may introduce extra sound, and in the frequency domain the voice spectrum is distorted. To avoid this, the Hamming window function is adopted: it extracts the signals while letting the two edges decay smoothly [54].
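As an illustration, the pre-emphasis filter and Hamming windowing described above can be sketched in Python with NumPy. This is a minimal sketch: the frame length, hop size, and filter coefficient are illustrative choices, not the paper's actual settings.

```python
import numpy as np

def preemphasis(x, a=0.95):
    """Pre-emphasis high-pass filter: y[n] = x[n] - a*x[n-1]."""
    return np.append(x[0], x[1:] - a * x[:-1])

def frame_and_window(x, frame_len=256, hop=128):
    """Split a signal into overlapping frames and apply a Hamming window
    so the frame edges decay smoothly instead of being cut off."""
    n_frames = 1 + (len(x) - frame_len) // hop
    w = np.hamming(frame_len)
    return np.stack([x[i * hop : i * hop + frame_len] * w
                     for i in range(n_frames)])

# Toy 440 Hz tone at an assumed 8 kHz sampling rate.
x = np.sin(2 * np.pi * 440 * np.arange(2048) / 8000)
frames = frame_and_window(preemphasis(x))
```

Each row of `frames` is one windowed analysis frame, ready for the spectral analysis described next.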
The bioacoustic signals of frogs must first be processed by a discrete-time conversion mechanism, and the discrete-time signals are then transferred to the spectral domain. The cepstrum of a signal is the inverse discrete-time Fourier transform (IDTFT) of the logarithm of the magnitude of the DTFT of the signal, defined as $c[n] = \frac{1}{2\pi} \int_{-\pi}^{\pi} \log\left|X(e^{i\omega})\right| e^{i\omega n}\, d\omega$, where $X(e^{i\omega}) = \sum_{n=-\infty}^{\infty} x[n] e^{-i\omega n}$ is the spectrum.
Clearly, $c[n]$ is a discrete function of the index $n$. If the input sequence $x[n]$ is generated by sampling an analog signal, we can consider $x[n] = x_a(n/f_s)$ and associate the index $n$ with time in this transformation.
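In practice the DTFT is approximated by a finite DFT, so the real cepstrum of one frame can be computed with a forward FFT, a log-magnitude, and an inverse FFT. The FFT size of 512 below is an illustrative assumption.

```python
import numpy as np

def real_cepstrum(x, n_fft=512):
    """c[n] = IDFT(log |DFT(x)|): the real cepstrum of one analysis frame."""
    spectrum = np.fft.fft(x, n_fft)
    log_mag = np.log(np.abs(spectrum) + 1e-12)  # small floor avoids log(0)
    return np.real(np.fft.ifft(log_mag))

# Cepstrum of a random toy frame.
x = np.random.default_rng(0).standard_normal(512)
c = real_cepstrum(x)
```

As a sanity check, a unit impulse has a flat magnitude spectrum, so its log-magnitude is (numerically) zero and its cepstrum is essentially zero everywhere.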
Suppose that a linearly filtered bioacoustic signal takes the form $y[n] = x[n] * h_d[n]$, where $h_d[n]$ models the linear distortion. If the analysis window is long in comparison with the length of $h_d[n]$, the short-time cepstrum of a frame of the filtered bioacoustic signal $y[n]$ will approximately take the form $c^{(y)}_m[n] \approx c^{(x)}_m[n] + c^{(h_d)}[n]$, where $c^{(h_d)}[n]$ comes out more or less the same in every frame $m$. Therefore, if we can estimate the value of $c^{(h_d)}[n]$, which is assumed to be non-time-varying, we can recover $c^{(x)}_m[n]$. Another way to remove the influence of linear distortions is to apply a simple first-difference operation across windows, since the distortion is the same in each frame: $\Delta c^{(y)}_m[n] = c^{(y)}_m[n] - c^{(y)}_{m-1}[n] = c^{(x)}_m[n] - c^{(x)}_{m-1}[n]$, so the linear distortion effects disappear. The fact that the weighted cepstrum distance measure conveys exactly the same practical meaning as the distance measure in the frequency domain is very important for models of human perception of sound and provides a basis for the frequency analysis conducted in the inner ear. This is how the Mel-frequency cepstrum coefficients method was born.
As just shown, a short-time Fourier analysis is performed first, and the outcome is the set of DFT values $X_m[k]$ for the $m$th frame. The DFT values are then gathered for each band and weighted by a triangular weighting function. The Mel-spectrum of the $m$th frame is defined as $MF_r = \frac{1}{A_r} \sum_{k=L_r}^{U_r} \left|V_r[k]\, X_m[k]\right|^2$, where $V_r[k]$ is the weighting function for the $r$th filter, with support from bin $L_r$ to bin $U_r$.
Here $A_r = \sum_{k=L_r}^{U_r} |V_r[k]|^2$ is the normalizing factor of the $r$th Mel-filter. The normalization is needed so that an ideally flat input Fourier spectrum produces a flat Mel-spectrum. For each frame, a discrete cosine transformation (DCT) of the Mel-filter outputs is computed to generate the function $\mathrm{mfcc}[n] = \frac{1}{R} \sum_{r=1}^{R} \log(MF_r)\, \cos\!\left(\frac{2\pi}{R}\left(r + \frac{1}{2}\right) n\right)$, which is evaluated for a number of coefficients $N_{mfcc}$.
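Putting the pieces together, a minimal MFCC computation along the lines of the equations above might look as follows in Python. The filter count, FFT size, and sampling rate are illustrative assumptions, and the normalization differs slightly from the $A_r$ form above for brevity.

```python
import numpy as np

def mel_filterbank(n_filters=20, n_fft=512, fs=8000):
    """Triangular weighting functions V_r[k] spaced on the Mel scale."""
    mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    imel = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    pts = imel(np.linspace(mel(0), mel(fs / 2), n_filters + 2))
    bins = np.floor((n_fft + 1) * pts / fs).astype(int)
    V = np.zeros((n_filters, n_fft // 2 + 1))
    for r in range(n_filters):
        lo, mid, hi = bins[r], bins[r + 1], bins[r + 2]
        V[r, lo:mid] = (np.arange(lo, mid) - lo) / max(mid - lo, 1)  # rising edge
        V[r, mid:hi] = (hi - np.arange(mid, hi)) / max(hi - mid, 1)  # falling edge
    return V

def mfcc(frame, n_coeff=13, n_fft=512, fs=8000):
    """Mel-spectrum of one frame followed by a DCT of the log energies."""
    X = np.abs(np.fft.rfft(frame * np.hamming(len(frame)), n_fft)) ** 2
    V = mel_filterbank(n_fft=n_fft, fs=fs)
    mel_spec = V @ X                      # MF_r: energy in each Mel band
    logmel = np.log(mel_spec + 1e-12)
    R = len(logmel)
    n = np.arange(n_coeff)[:, None]
    r = np.arange(R)[None, :]
    dct = np.cos(np.pi * n * (2 * r + 1) / (2 * R))  # DCT-II basis
    return (dct @ logmel) / R

frame = np.random.default_rng(1).standard_normal(400)
coeffs = mfcc(frame)
```

The resulting vector of `n_coeff` coefficients is the per-frame feature vector that feeds the classifiers described next.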

Feedforward Neural Network Approach
Machine learning can be seen as the process of using resource-based computation to implement learning algorithms. More precisely, machine learning is a complex computational process of automatic pattern identification and intelligent decision-making embedded in the training of sample data. A feedforward neural network approach, widely used for supervised bioacoustics classification, is shown in Figure 2. Feedforward multilayer networks with sigmoid nonlinear functions are often termed multilayer perceptrons (MLPs). Machine learning methods can be classified into three groups: supervised learning, unsupervised learning, and reinforcement learning; in this research, we use the FNNA structure to carry out supervised learning only. The structure is defined [55][56][57][58] as follows. The feedforward neural network is a general machine learning method that transforms inputs into outputs in line with the targets. Through nonlinear signal processing in an arbitrary number of connected groups of artificial neurons, the so-called hidden layers are formed. When the FNNA is used, it is important to adjust a set of weights to minimize the classification error. A useful instrument common to many learning schemes is the least mean square (LMS) convergence criterion: the goal of the feedforward neural network is to narrow the gap between the ground truth $Y$ and the output $f(X; W)$ by minimizing $E(X) = (Y - f(X; W))^2$.
The procedure of such an approach relies on both the weighting scheme and the transfer function $T_f$, which are essential to the connections between the neurons. The general function of the feedforward neural network approach is $f(X) = T_f\!\left(\sum_i w_i x_i + b\right)$, where $w_i$ are the weights and $b$ is the bias.
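A minimal feedforward network of this kind, trained by gradient descent on the squared error $E(X) = (Y - f(X; W))^2$, can be sketched in Python with NumPy. The layer sizes and the toy two-class dataset are hypothetical stand-ins, not the paper's frog data.

```python
import numpy as np

rng = np.random.default_rng(42)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class FeedforwardNet:
    """One hidden layer with a sigmoid transfer function and a linear
    output, trained by back-propagation of the squared error."""
    def __init__(self, n_in, n_hidden, n_out):
        self.W1 = rng.standard_normal((n_in, n_hidden)) * 0.5
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.standard_normal((n_hidden, n_out)) * 0.5
        self.b2 = np.zeros(n_out)

    def forward(self, X):
        self.h = sigmoid(X @ self.W1 + self.b1)   # hidden activations
        return self.h @ self.W2 + self.b2          # linear output layer

    def train_step(self, X, Y, lr=0.1):
        out = self.forward(X)
        err = out - Y                              # dE/d(output)
        # Back-propagate the error through both layers.
        dW2 = self.h.T @ err / len(X)
        dh = err @ self.W2.T * self.h * (1 - self.h)
        dW1 = X.T @ dh / len(X)
        self.W2 -= lr * dW2; self.b2 -= lr * err.mean(0)
        self.W1 -= lr * dW1; self.b1 -= lr * dh.mean(0)
        return float((err ** 2).mean())

# Toy two-class problem standing in for the frog-call features.
X = rng.standard_normal((200, 4))
Y = (X[:, :1] + X[:, 1:2] > 0).astype(float)
net = FeedforwardNet(4, 8, 1)
losses = [net.train_step(X, Y) for _ in range(500)]
```

The training loop drives the mean squared error downward over the epochs, which is the LMS convergence behavior the text describes.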

Support Vector Machine Approach
The support vector machine combines machine learning methods and statistical methods, aiming to generate a mapping from the training dataset that establishes a route between input and output. The basic concept is illustrated in Figure 3.
The SVM finds the hyperplane that maximizes the separating margin between the two classes [59]. This hyperplane can be found by minimizing the cost function $J(w) = \frac{1}{2}\|w\|^2$ subject to the separability constraints $y_i(w \cdot x_i + b) \geq 1$, $i = 1, \ldots, l$. Slack variables $\xi_i \geq 0$ can be used to relax the separability constraints to $y_i(w \cdot x_i + b) \geq 1 - \xi_i$, with the cost function becoming $J(w, \xi) = \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{l} \xi_i$. The linear support vector machine with a single separating hyperplane can classify things into only two classes, which is evidently not sufficient for making medical predictions where classification into several classes is often needed. When the features of interest in the sample space cannot be separated by hyperplanes alone, nonlinear techniques are called for. Therefore, the nonlinear form of the algorithm is usually applied for complex applications in the real world.
Now consider the input vector $x \in \mathbb{R}^d$ that is transformed into the feature vector $\Phi(x)$ through a nonlinear mapping $\Phi : \mathbb{R}^d \to \mathbb{R}^H$. The problem is then solved by assuming a kernel function $K : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$ defined as $K(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j)$, where $x_i$ and $x_j$ are any pair of input vectors. Thus, the optimal separating contours are defined by the function $f(x) = \sum_{k=1}^{l} \alpha_k y_k K(x, x_k) + \beta$, where the scalars $\alpha_k$ and $\beta$ depend on $x_k$ and $y_k$, $k = 1, \ldots, l$. The support vector machine offers great generalization capability. It is robust for high-dimensional data, well suited to training, and performs well in comparison with traditional artificial neural networks. However, the support vector machine is very sensitive to uncertainties: the larger the dimension of the space, the higher the probability that it will be trapped in a lengthy learning process. For some real-time applications, the evaluation of the function $f$ may be difficult to manage. Therefore, a balance has to be maintained between the generalization properties of the SVM and its sluggishness when learning from very large databases. In brief, the key is a good choice of the input variables from the dataset, so that the dimension of the space can be reduced and the approximation functions can be efficient and accurate [60].
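For intuition, a linear SVM of the kind described above can be trained by subgradient descent on the hinge-loss form of the cost function. The following NumPy sketch uses a hypothetical two-cluster toy dataset rather than real frog features; the learning rate and epoch count are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_linear_svm(X, y, C=1.0, lr=0.01, epochs=200):
    """Minimize (1/2)||w||^2 + C * sum of hinge losses by subgradient
    descent. Labels y must be in {-1, +1}."""
    w = np.zeros(X.shape[1]); b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1                      # points inside the margin
        grad_w = w - C * (y[viol, None] * X[viol]).sum(0)
        grad_b = -C * y[viol].sum()
        w -= lr * grad_w / len(X)
        b -= lr * grad_b / len(X)
    return w, b

# Two linearly separable clusters standing in for two frog classes.
X = np.vstack([rng.normal(-2, 0.5, (50, 2)), rng.normal(2, 0.5, (50, 2))])
y = np.array([-1] * 50 + [1] * 50)
w, b = train_linear_svm(X, y)
acc = np.mean(np.sign(X @ w + b) == y)
```

The sign of $w \cdot x + b$ assigns each point to one side of the learned hyperplane; replacing the dot product with a kernel $K$ would give the nonlinear form described in the text.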

Results and Verification
The analysis is mainly carried out by applying a digital MFCC algorithm to the data from the acoustic sensors. Figure 4 shows the experimental structure of our research process.

Our original information on frog sounds comes from the digital learning website [27], and the sound files are shown in Figure 5. By adjusting the pre-emphasis coefficient "a", twenty-five filtered features of each frog are obtained, as shown in Figure 6. The MFCC filtering algorithm is applied to transform the time-domain signals into spectral features, as shown in Figure 7, and a dataset for training is formed, as shown in Figure 8. The curves in the figure correspond to Odorrana-swinhoana, Rana-okinavana (blue), and Rana-guentheri (black).
When preprocessing is done, the key analysis follows. The feedforward neural network algorithm is used to train the frog bioacoustic model datasets. The computation platform used for the analysis is "MATLAB R2019a-academic use", licensed to National Cheng Kung University, Tainan, Taiwan, and the two processors adopted for classification are a GPU and a CPU. For each processor, two optimizer functions, "traingda" and "trainscg", are executed. Going from the current layer to the next layer of neurons, many comparisons and tests are conducted so that the two processors can learn and attempt to predict the expected model. When new testing data become the input, this algorithm structure can monitor the data and predict the classification results using the program learned from the training datasets. The machine learning computation is based on the "deep feedforward neural network" method, and the back-propagation algorithm is used to update the weighting scheme. In the deep learning computation procedure, the back-propagation algorithm is iterated not only to minimize the loss function but also to approach the best performance of the testing model. With regard to the optimizer functions, "traingda" (GDA) is a neural network optimizer function that updates the weighting scheme and the bias values by gradient descent with an adaptive learning rate (LR), which optimally changes the learning rate during training.
This keeps the computation stable. In short, the "traingda" function can automatically alter the LR by adapting to the convergence of the error values.
Each variable is adjusted according to gradient descent. At each epoch, if performance decreases toward the goal, the LR is increased by the factor "lr_inc"; in contrast, if performance grows by more than the factor "max_perf_inc", the LR is reduced by the factor "lr_dec". A general fixed learning rate η is thus replaced by an adaptive learning rate that is rescaled at every epoch. The "trainscg" is a network optimizer function that updates bias values and weights according to the scaled conjugate gradient (SCG) method. It can train any network as long as its net input, weight, and transfer functions have derivatives. Backpropagation is used to calculate the derivatives of the performance "perf" with respect to the weight and bias vectors x.
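The adaptive learning-rate rule behind "traingda" can be illustrated with a small Python sketch on a toy quadratic loss. The parameter names lr_inc, lr_dec, and max_perf_inc mirror MATLAB's, but the loop itself is a simplified stand-in for the actual toolbox implementation.

```python
import numpy as np

def gda_descent(grad, x0, loss, lr=0.1, lr_inc=1.05, lr_dec=0.7,
                max_perf_inc=1.04, epochs=100):
    """Gradient descent with a traingda-style adaptive learning rate:
    grow the LR after a successful step, shrink it after a bad one."""
    x = np.array(x0, dtype=float)
    prev = loss(x)
    for _ in range(epochs):
        step = x - lr * grad(x)
        cur = loss(step)
        if cur > prev * max_perf_inc:
            lr *= lr_dec          # step rejected: error grew too much
        else:
            x, prev = step, cur
            lr *= lr_inc          # step accepted: speed up
    return x, prev

# Toy quadratic loss with its minimum at (3, -1).
loss = lambda x: (x[0] - 3) ** 2 + (x[1] + 1) ** 2
grad = lambda x: np.array([2 * (x[0] - 3), 2 * (x[1] + 1)])
x_opt, final = gda_descent(grad, [0.0, 0.0], loss)
```

The LR keeps growing while the error keeps falling, and is cut back whenever an update would let the error rise by more than max_perf_inc, which is the stabilizing behavior the text attributes to GDA.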
Three important diagrams are produced for both the GPU and the CPU: Regression, Performance, and Training State. In Regression, the slope of the linear regression, marked as R (which can be read as an efficiency), shows the level this machine training is able to achieve. The point of Performance is to see whether the green circle can catch up with the current epoch. The Training State indicates whether the efficiency of the analysis is going up or down. Basically, we try to improve the accuracy of the classification by changing the training function and the processor, while the rest of the parameter settings stay the same. The related parameters are set as a learning rate of 0.00008 and an initial epoch count of 30, as listed in Table 1. Comparatively, Table 2 compares the total time between the feedforward neural network and the support vector machine.
Figure 9. Regression results for the graphics processing unit (GPU) with the optimizer function gradient descent with adaptive learning rate (GDA). There are four score values: training score, validation score, test score, and all score. Regardless of the score type, they all approach the Y = T line, which indicates fully correct training. Figure 10. (a) The best validation performance of the GPU with the optimizer function GDA. In addition to the convergence condition of the mean squared error, the green circle also shows whether the training could catch up with the epochs; the same concept with different settings is shown in Figures 13a, 16a and 19a. (b) The training state of the GPU with the optimizer function GDA. Because of the adaptive learning rate, the optimizer function GDA can adjust the learning rate by itself to approach the best regression, but this takes some time. The gradient line indicates how quickly the error is reduced and converges to the goal throughout the computation; the same concept can also be seen in Figures 13b, 16b and 19b.
Figure 22. Classification diagram of the support vector machine (SVM). Each frog is regarded as a class. Using the nonlinear SVM algorithm, the first frog is chosen as the reference and the other frogs are classified relative to it. The x-axis represents the 350 feature points selected from the twenty-five pre-emphasis values of "a", while the y-axis shows the classification result as a ratio. A zero ratio indicates the hyperplane; the magnitude of the ratio gives the relative likelihood of identifying each frog (a negative sign carries the same meaning as a positive one).


SVM classifier optimizer index
Although the four combinations of processors and optimizers achieve high and similar R-scores, some differences remain. First, owing to its adaptive learning rate, the optimizer function GDA can adjust the learning rate by itself to approach the best regression. Second, in Figure 10b it is obvious that the gradient with function GDA goes up and down and appears quite unstable, whereas the gradient with function SCG in Figure 16b converges more stably; Figures 10a and 16a confirm this observation. Third, in Figure 10b the validation check rises sharply, but in Figure 16b it stays almost unchanged, meaning that the GDA function needs more iteration checks to balance the unsteady gradient before it can converge. Fourth, as shown in Table 1, the total time taken by the GPU combined with either optimizer is shorter than that taken by the other combinations, indicating that the GPU computes faster than the CPU. Finally, Table 2 compares the total time taken by the neural networks with that taken by the support vector machine, and shows that the neural networks take slightly less time.

Conclusions
This study applies artificial intelligence (AI) to frog species classification based on the symmetry of the bioacoustic spectra. The feedforward neural network structure is key to carrying out the feature extraction and effective classification. This research uses bioacoustic signals and machine learning methods to identify twenty-five frog species. Our study proposes an MFCC-based acoustic signal processing method and a classification system built on the feedforward neural network approach.
The proposed method performs bioacoustic detection for the categorization of animal species, using neural networks trained by machine learning on the training datasets to compute and develop the best model for making meaningful predictions. It is well suited to understanding anything that produces sound. Through bioacoustic detection, making predictions about things of primary concern from voice signals becomes feasible and meaningful. In the long term, once the method is fully developed, these techniques may be extended to other fields.
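The pipeline summarized above, feature vectors in, a small feedforward network out, can be sketched end to end. This is a hypothetical illustration: the 12-dimensional feature vectors below are synthetic stand-ins for MFCCs (real ones would be extracted from the recorded calls), and three Gaussian clusters play the role of three frog species.

```python
import numpy as np

rng = np.random.default_rng(1)
n_per, n_feat, n_cls = 60, 12, 3

# Three clusters of fake 12-dim "MFCC" vectors, one cluster per species.
centers = rng.normal(size=(n_cls, n_feat)) * 3.0
X = np.vstack([c + rng.normal(scale=0.5, size=(n_per, n_feat)) for c in centers])
y = np.repeat(np.arange(n_cls), n_per)
X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize the features

# One hidden layer with tanh activation and a softmax output layer.
W1 = rng.normal(scale=0.1, size=(n_feat, 16)); b1 = np.zeros(16)
W2 = rng.normal(scale=0.1, size=(16, n_cls)); b2 = np.zeros(n_cls)

def forward(X):
    h = np.tanh(X @ W1 + b1)
    z = h @ W2 + b2
    p = np.exp(z - z.max(axis=1, keepdims=True))
    return h, p / p.sum(axis=1, keepdims=True)

# Plain batch gradient descent on the cross-entropy loss (backpropagation).
onehot = np.eye(n_cls)[y]
for _ in range(300):
    h, p = forward(X)
    dz = (p - onehot) / len(X)
    dW2 = h.T @ dz; db2 = dz.sum(axis=0)
    dh = dz @ W2.T * (1 - h ** 2)           # tanh derivative
    dW1 = X.T @ dh; db1 = dh.sum(axis=0)
    W2 -= 0.5 * dW2; b2 -= 0.5 * db2
    W1 -= 0.5 * dW1; b1 -= 0.5 * db1

train_acc = np.mean(forward(X)[1].argmax(axis=1) == y)
```

In the study itself the features come from the MFCC front end and the network is trained with the GDA or SCG optimizer on a CPU or GPU; this sketch only shows the shape of the classification stage.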
The experimental results of this study show that the two optimizer functions and the two processors can effectively produce predictions and reliable analyses of different frogs. The optimizer function SCG achieves better identification and classification. In addition, the results indicate that original data can be used directly as input for machine learning. In summary, our study suggests that effective machine learning on big data can also be reached by using the recurrent neural network (RNN) algorithm, a time-variant method. The AI algorithm helps search for distinctive and interesting features, make meaningful classifications, and identify feasible feature models corresponding to various animal bioacoustic conditions.
In future work, we will conduct a similar experiment using different machine learning methods and algorithms, with the dataset extended to thirty-five kinds of frogs. The wavelet algorithm is also a very useful feature extraction tool for frequency-domain analysis. Moreover, newer machine learning methods such as recurrent neural networks (RNN) and long short-term memory (LSTM) can be applied to natural language identification. In addition to classifying a single animal group, we may advance the classification process to cover many kinds of animals. Since acoustic detection methods can be applied widely, we hope they will lead to improvements in the medical field, reducing time-consuming therapies by adding acoustic detection methods that analyze symptoms such as the heartbeat and the sound of flowing blood more efficiently.