Evaluating the classification of Fermi BCUs from the 4FGL Catalog Using Machine Learning

The recently published fourth Fermi Large Area Telescope source catalog (4FGL) reports 5065 gamma-ray sources in terms of direct observational gamma-ray properties. Among the sources, the largest population is the active galactic nuclei (AGN), which consists of 3137 blazars, 42 radio galaxies, and 28 other AGNs. The blazar sample comprises 694 flat-spectrum radio quasars (FSRQs), 1131 BL Lac-type objects (BL Lacs), and 1312 blazar candidates of unknown type (BCUs). The classification of blazars is difficult using optical spectroscopy, given the limited knowledge of their intrinsic properties and the limited availability of astronomical observations. To overcome these challenges, machine learning algorithms are being investigated as alternative approaches. Using the 4FGL catalog, a sample of 3137 Fermi blazars with 23 parameters is systematically selected. Three established supervised machine learning algorithms (random forests (RFs), support vector machines (SVMs), and artificial neural networks (ANNs)) are employed to generate predictive models to classify the BCUs. We analyze the results for all of the different combinations of parameters. Interestingly, a previously reported trend, that using more parameters leads to higher accuracy, is not found. Considering the least number of parameters used, combinations of eight, 12, or 10 parameters in the SVM-, ANN-, or RF-generated models achieve the highest accuracy (accuracy $\simeq$ 91.8\% for the SVM and ANN, and $\simeq$ 92.9\% for the RF). Using the combined classification results from the optimal combinations of parameters, 724 BL Lac type candidates and 332 FSRQ type candidates are predicted; however, 256 remain without a clear prediction.

1. INTRODUCTION

Some of the most luminous sources in the extragalactic γ-ray sky are blazars, which are a sub-class of active galactic nuclei (AGN). Their multi-wavelength spectral energy distributions (SEDs) often exhibit a bimodal shape in the logν − logνFν space and cover the entire electromagnetic spectrum (e.g., from the radio to the γ-ray bands). Their spectral energy is dominated by non-thermal emission, which is thought to originate in a relativistic jet aligned at a small viewing angle with respect to the line of sight (Urry & Padovani 1995). The lower-energy part of the distribution is attributed to synchrotron emission produced by the non-thermal electrons in the jet; it peaks within the millimeter to soft X-ray waveband. The higher-energy part of the distribution is attributed to inverse Compton (IC) scattering; it peaks within the MeV to GeV range. Blazars are further classified into two categories according to the presence or absence of strong emission lines in their optical spectra (Urry & Padovani 1995).
Recently, a new version of the fourth Fermi Large Area Telescope source catalog (4FGL) has been released (The Fermi-LAT collaboration 2019a). This version draws on the first eight years of science data from the Fermi Gamma-ray Space Telescope mission in the energy range from 50 MeV to 1 TeV (The Fermi-LAT collaboration 2019a,b). The new 4FGL catalog includes 5065 sources above 4σ significance, most of which are AGNs. The largest source population of AGNs includes 3137 blazars (i.e., 98%), 42 radio galaxies, and 28 other AGNs; more than 2860 AGNs are located at high Galactic latitudes (|b| > 10°). The blazars in the 4FGL catalog include 694 flat-spectrum radio quasars (FSRQs), 1131 BL Lac-type objects (BL Lacs), and 1312 blazar candidates of unknown type (BCUs).
Sources classified as FSRQs or BL Lacs in the 4FGL catalog are those whose optical classifications have been well established from the literature and/or an optical spectrum (see The Fermi-LAT collaboration 2019a,b). BCUs are sources that are classified as BZU (blazars of uncertain type) objects in the BZCAT catalog, and/or sources that display a flat radio spectrum and a typical two-humped, blazar-like SED in one or more of the WISE, AT20G, VCS, CRATES, PMN-CA, CRATES-Gaps, or CLASS catalogs, or in radio and X-ray catalogs (see The Fermi-LAT collaboration 2019a,b for the details and references therein).
The Fermi catalog (each version, i.e., the First, Second, Third, and Fourth Fermi Large Area Telescope source catalogs, denoted 1FGL, 2FGL, 3FGL, and 4FGL) is by far the largest gamma-ray source catalog when released; it provides a unique and excellent opportunity for investigating the physics of the γ-ray emission of blazars (e.g., Singal et al. 2012; Xiong & Zhang 2014; Singal 2015; Xiong et al. 2015a,b; Chen et al. 2016; Fan et al. 2016a,b; Ghisellini 2016; Lin & Fan 2016; Xiong et al. 2016; Lin et al. 2017; Chen 2018; Kang et al. 2018; Lin & Fan 2018; Kang et al. 2019). Among the current techniques used in the study of astronomy and astrophysics (e.g., Baron 2019; Longo et al. 2019; Faisst et al. 2019; Kang et al. 2019), powerful data mining and machine learning algorithms have been widely adopted. Several literature reviews are available on this topic (Ball & Brunner 2010; Feigelson & Babu 2012; Way et al. 2012). The use of supervised machine learning (SML) algorithms, for example, has been extensively explored using the earlier versions of the Fermi catalogs (e.g., 1FGL, 2FGL, and 3FGL). For instance, using the 1FGL catalog, Ackermann et al. (2012) identified 221 AGN and 134 pulsar candidates from the 630 unassociated sources; this was accomplished using a random forest (RF) and a logistic regression multivariate method. Using the 2FGL catalog, Mirabal et al. (2012) identified 216 AGN candidates from 269 unassociated sources (located at high Galactic latitude, |b| > 10°); this research adopted an RF method. Hassan et al. (2013) also used the 2FGL catalog; this research applied support vector machine (SVM) and RF methods to identify BL Lac or FSRQ candidates from the 269 BCUs in the 2FGL catalog. In addition, Doert & Errando (2014) applied artificial neural network (ANN) and RF algorithms to separate "AGN" from "non-AGN" among 576 unassociated sources in the 2FGL catalog. Using the 3FGL catalog, Chiaro et al. (2016) identified 314 BL Lac candidates and 113 FSRQ candidates among the BCUs; this was accomplished using an ANN algorithm. Saz Parkinson et al. (2016) also used the 3FGL catalog; this research identified 334 pulsar candidates and 559 AGN candidates from the unassociated sources using RF and logistic regression algorithms. Einecke (2016) searched for high-confidence blazar candidates based on the 3FGL, an infrared, and an X-ray catalog using an RF algorithm. In addition, Lefaucheur & Pita (2017) first identified blazar candidates from the 3FGL unassociated sources; they subsequently classified BL Lacs or FSRQs from these candidates and the BCUs reported in 3FGL using multivariate classifications. Furthermore, Salvetti et al. (2017) used the 3FGL catalog and identified BL Lacs and FSRQs from 559 3FGL unassociated sources using an ANN algorithm.
Using a 3LAC clean sample selected from the 3FGL catalog, Kang et al. (2019) (Paper I) identified BL Lacs and FSRQs using four different machine learning algorithms (Mclust Gaussian finite mixture models, decision trees, RF, and SVM). This research used eight parameters. In the work summarized above, only a subset of the parameters is selected for supervised learning, based on certain selection criteria, to evaluate the classification of the Fermi sources. Taking a different approach, Xiao et al. (2019) have recently compiled an extensive collection of parameters to utilize in the application of supervised learning. Even the astronomical coordinates of the Fermi sources are considered in the initial stage of the supervised learning to evaluate the potential optical classification of BCUs. Based on the analysis of this research, in particular related to the selection of parameters, three open research questions are identified regarding the classification of blazars: RQ1, do all possible parameters need to be considered in the application of SML algorithms? RQ2, does the use of more parameters improve the accuracy of SML algorithms? RQ3, does a single optimal combination of parameters exist for SML algorithms?
Three popular SML algorithms (RF, SVM, and ANN; e.g., see Baron 2019 for reviews) are utilized in this research to investigate the potential classification of BCUs. The 4FGL catalog is used, which provides the direct observational gamma-ray properties. Section 2 describes the method used to select the parameters and data sample from the catalog. In Section 3, the three SML techniques used in the research are briefly introduced. The classification results are reported in Section 4. Section 5 presents the discussion and conclusions of the research results.

2. SAMPLE
In the updated table released on 30 September 2019 (4FGL FITS file "gll_psc_v20.fit") of the 4FGL catalog, 84 variables are reported using 333 columns (The Fermi-LAT collaboration 2019a, Table 12). Among the 84 variables, some contain multiple columns. The data includes the parameters "Flux Band" (seven columns are used to present the integral photon flux in each of the seven spectral bands, marked as Flux Band1, Flux Band2, etc.; see Table 1); "Unc Flux Band" (14 columns are used to present the 1σ lower and upper errors for each "Flux Band"); "nuFnu Band" (seven columns are used to present the SED for the spectral bands, marked as nuFnu Band1, nuFnu Band2, etc.; see Table 1); and "Sqrt TS Band" (seven columns are used to represent the square root of the test statistic for the spectral bands). Historical parameters include "Flux History", "Unc Flux History", and "Sqrt TS History", which use 8, 16, and 8 columns, respectively, to present the annual integral photon flux from 100 MeV to 100 GeV, the 1σ lower and upper errors on the integral photon flux, and the square root of the test statistic; and "Flux2 History", "Unc Flux2 History", and "Sqrt TS2 History", which use 48, 96, and 48 columns, respectively, to present the two-month integral photon flux from 100 MeV to 100 GeV, the 1σ lower and upper errors on the integral photon flux, and the square root of the test statistic.
There are three main steps used to create the data sample. The first step is to identify the subset of parameters and their associated data. To achieve this, the coordinate columns, error columns, string columns, and columns with mostly missing data (e.g., "Sqrt TS Band") are removed. This results in selecting 34 candidate parameters (see Table 1) from the 4FGL table. In order to simplify the calculation, some parameters are pre-selected for the SML algorithms. To accomplish this, three two-sample tests are used to assess the independence of the 34 parameters: the Kolmogorov-Smirnov test, the Welch two-sample t-test, and the Wilcoxon rank sum test with continuity correction (e.g., Acuner & Ryde 2018; Kang et al. 2019). The tests are applied to two subsamples of the data (694 FSRQs and 1131 BL Lacs); the results are summarized in Table 1. Considering p > 0.05, one parameter ("PLEC Exp Index") is excluded; therefore, 33 parameters are selected in this work.
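The pre-selection step above can be illustrated with a short sketch. The paper performs these tests in R; the following is a minimal Python/SciPy analogue on synthetic stand-in data — the sample sizes match the 694 FSRQs and 1131 BL Lacs, but the distributions and the parameter (a photon-index-like quantity) are hypothetical, chosen only to show the three tests.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical stand-ins for one 4FGL parameter (e.g., a photon index):
# FSRQs typically have softer spectra than BL Lacs.
fsrq = rng.normal(2.44, 0.20, 694)
bll = rng.normal(2.02, 0.25, 1131)

D, p1 = stats.ks_2samp(fsrq, bll)                    # Kolmogorov-Smirnov test
t, p2 = stats.ttest_ind(fsrq, bll, equal_var=False)  # Welch two-sample t-test
W, p3 = stats.mannwhitneyu(fsrq, bll)                # Wilcoxon rank-sum test

# A parameter survives pre-selection only if the two classes differ
# significantly (the paper excludes parameters with p > 0.05).
keep = p1 < 0.05
```

A parameter for which all three p-values are large carries little class information and is dropped, which is exactly how "PLEC Exp Index" is excluded in the text.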
The second step is to ensure the parameter selection is reasonable. This is accomplished by applying an RF algorithm (see Section 3) to the parameters' data to compute the Gini coefficients (see Liaw & Wiener 2002; Breiman et al. 2003 for the details and references therein), which is an established method to determine the importance of the variables. These results are presented in Table 1; they are consistent with those of the two-sample tests. Based on the selected 33 parameters, a subset of the data is selected from the 4FGL catalog, which includes 3137 blazars (1131 BL Lacs, 1312 BCUs, and 694 FSRQs). The catalog has a total of 1312 BCUs; these are listed in Table 3.
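The Gini-based importance check works the same way in any RF implementation. The paper uses the randomForest package in R; as an illustrative sketch on a fabricated two-feature toy dataset (not the 4FGL data), scikit-learn exposes the same quantity via `feature_importances_`, along with the out-of-bag score mentioned in Section 3.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)
n = 600
y = rng.integers(0, 2, n)                      # toy labels: 0 = BL Lac, 1 = FSRQ
informative = y * 1.5 + rng.normal(0, 1, n)    # correlates with the class
noise = rng.normal(0, 1, n)                    # carries no class information
X = np.column_stack([informative, noise])

rf = RandomForestClassifier(n_estimators=200, random_state=0, oob_score=True)
rf.fit(X, y)

# Gini-based variable importance: the informative feature dominates,
# mirroring how Table 1 ranks the 4FGL parameters.
importances = rf.feature_importances_
```

Parameters whose importance is near zero would be candidates for removal, consistent with the two-sample test ranking.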
The third step is to further reduce the number of parameters to ensure the study can be completed. For the selected 33 parameters, there are 8589934591 different combinations. Assuming that the RF, SVM, and ANN are used to calculate each combination of parameters, a computer (with 4 cores) requires approximately 1/4 second per combination. In this scenario, computing all of the combinations costs approximately 24855 days, i.e., over 68 years. This is not feasible. In order to reduce the calculation time, we further sub-selected 23 parameters (8388607 different combinations, requiring approximately 24 days) to carry out our work. This sub-selection is also based on the two-sample test results (see Table 1), by considering D > 0.30 in the Kolmogorov-Smirnov test, or p3 < 1.00E−20 in the Wilcoxon rank sum test. A horizontal line is introduced in Table 1 to distinguish the collection of 23 parameters utilized.

Note—Column 1 presents the parameter labels in the sample. Column 2 lists the selected parameters. The two-sample Kolmogorov-Smirnov test results for the test statistic (D) and the p-value (p1) are presented in Columns 3 and 4, respectively. The Welch two-sample t-test results for the t-statistic (t), the degrees of freedom for the t-statistic (df), and the p-value (p2) are presented in Columns 5, 6, and 7, respectively. The Wilcoxon rank sum test with continuity correction test statistic (W) and p-value (p3) are presented in Columns 8 and 9, respectively. The Gini coefficient (Gini), an indicator of variable importance in RFs, is presented in Column 10.
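The combinatorial budget above follows directly from counting non-empty subsets of the parameter set. A quick check of the arithmetic (the ~0.25 s per combination figure is the paper's own estimate):

```python
n_all, n_sub = 33, 23

# Number of non-empty parameter subsets: 2^n - 1
total_all = 2**n_all - 1      # 8589934591 combinations of the 33 parameters
total_sub = 2**n_sub - 1      # 8388607 combinations of the 23 parameters

# Rough runtime at ~0.25 s per combination (the paper's 4-core estimate)
days_all = total_all * 0.25 / 86400   # ~24855 days, i.e., over 68 years
days_sub = total_sub * 0.25 / 86400   # ~24 days
```

Cutting from 33 to 23 parameters shrinks the search space by a factor of 2^10 = 1024, which is what turns an infeasible 68-year computation into a 24-day one.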

3. SUPERVISED MACHINE LEARNING ALGORITHMS
In this section, a brief introduction to the popular, well-established SML algorithms adopted in this work is provided, covering RF, SVM, and ANN (e.g., see Baron 2019 for reviews). These approaches share a common, high-level approach to establish and assess the accuracy of predictive models. They divide a dataset into training, validation, and forecast samples. The training and validation samples contain predictive variables and known outcomes; the forecast sample contains only predictive variables. SML algorithms use the training sample to generate a predictive model; the accuracy of the model is evaluated using the separate validation sample. An accurate model can subsequently be used to predict the outcomes of the forecast datasets, for which the outcomes are not known a priori (Feigelson & Babu 2012; Kabacoff 2015).
Beyond the high-level, common approach of dividing the dataset to establish and assess predictive models, the RF, SVM, and ANN algorithms have distinct characteristics. The original RF proposal (Breiman 2001), which has evolved over time, transforms a training sample into a large collection of decision trees (i.e., a forest). These trees are used to conduct an extensive voting scheme, which enhances the classification and prediction accuracy of the model. The RF algorithm has numerous advantages, including accuracy, scalability, and the ability to address challenging datasets. In terms of accuracy, the RF approach has outperformed alternative approaches, for example, decision trees (Fernández-Delgado et al. 2014). This approach has been applied to very large astronomical datasets (Breiman et al. 2003). RF successfully builds predictive models for uneven datasets, for example, those with large amounts of missing data or a relatively limited ratio of observations to the number of variables. The RF approach also generates out-of-bag error rates, in addition to measures indicating the relative importance of the variables.
The original SVM proposal (Vladimir N. Vapnik and Alexey Ya. Chervonenkis, 1963) has evolved over the years, providing improvements in accuracy and performance. In the recent versions of the algorithm (Vapnik 1995, 2000), data points in a training sample are efficiently mapped into a hyperplane, or a set of hyperplanes. The training sample is often defined in a finite-dimensional problem space, whereas the hyperplanes are defined in a high- or infinite-dimensional space to make the separation easier. The hyperplanes are defined by an orthogonal, and therefore minimal, set of vectors; more specifically, the dot product of a training data point with a vector in that space is a constant value. The hyperplanes achieve the largest functional margin to improve the accuracy of the model, as this reflects greater distances separating the classes. SVM algorithms are popular; they have been used to generate accurate prediction models in a wide variety of domains.
ANNs (see Venables & Ripley 2002) have also been improved over time; they are a family of algorithms based on structures inspired by biological neural networks, such as the human brain. These structures consist of multiple layers, including an input layer, one or more hidden layers, and an output layer. Each layer is designed to recognize a specific element in the data, then propagate the result to the next layer. In combination, the layers can learn to recognize complex features in the data. A key advantage of ANNs is that the network can learn from the data it observes and become more accurate; in this way, an ANN is used as a general function approximation tool. However, ANNs require large training sets and have high computational demands.
Numerous software packages are available for the RF, SVM, and ANN algorithms. The following have been selected for this work. For the RF algorithm, the randomForest package (Liaw & Wiener 2002) in R (R version 3.6.1, R Core Team 2019) is used to fit the random forests. For the SVM algorithm, the e1071 package (Meyer et al. 2019) is used to build the predictive model in the base R installation. For the ANN algorithm, the nnet package (Venables & Ripley 2002) in R is used to create the predictive network. In addition, the accuracy of the models is calculated using the classAgreement() function in the e1071 package.

4. RESULTS

The accuracies of the different parameter combinations in each of the SML algorithms (SVM, RF, and ANN) are computed. The highest prediction accuracies for the different combinations of parameters in each of the SML algorithms are illustrated in Figure 1. In the figure, the RF is represented with a red solid line + empty square; the SVM is represented with a blue dashed line + empty circle; and the ANN is represented with a green dotted-dashed line + star. As the number of parameters increases, the accuracy gradually reaches its maximum. Here, with combinations of eight, 10, or 12 parameters (see Table 2), the accuracy of the SVM, RF, or ANN reaches its maximum, respectively. For the ANN algorithm, there is one combination of 12 parameters achieving the maximum accuracy (accuracy = 0.918). For the RF algorithm, there is one combination of 10 parameters achieving the maximum accuracy (accuracy = 0.929). For the SVM algorithm, there are 15 combinations of 8 parameters that achieve the maximum accuracy (accuracy = 0.918) (see Table 2). These results reveal that different combinations with the same number of parameters can achieve the maximum accuracy in the SVM algorithm. When more parameters are applied, the accuracy begins to decline (e.g., beyond the 12-, 10-, or 17-parameter combinations in the ANN, RF, or SVM). These results are inconsistent with the conclusion of our previous work (Paper I), which indicated that more parameters yield higher accuracy.
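The exhaustive search over parameter combinations can be sketched as follows. The paper runs it in R over the 2^23 − 1 subsets with all three algorithms; this minimal Python/scikit-learn version uses a small synthetic dataset, only five features, and only an SVM, purely to show the structure of the loop.

```python
import itertools
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n, n_feat = 400, 5
y = rng.integers(0, 2, n)
X = rng.normal(0, 1, (n, n_feat))
X[:, 0] += y          # only features 0 and 1 carry class information
X[:, 1] += 0.5 * y

X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.2, random_state=0)

# Evaluate every non-empty feature subset and keep the most accurate one,
# as done for the 23 gamma-ray parameters in the paper.
best_acc, best_combo = 0.0, None
for k in range(1, n_feat + 1):
    for combo in itertools.combinations(range(n_feat), k):
        clf = SVC().fit(X_tr[:, combo], y_tr)
        acc = clf.score(X_va[:, combo], y_va)
        if acc > best_acc:
            best_acc, best_combo = acc, combo
```

With 23 parameters the inner loop runs 8388607 times per algorithm, which is why the runtime budget in Section 2 matters; the same loop with RF and ANN classifiers yields the per-algorithm optima reported above.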
According to the optimal combinations of parameters (eight, 10, or 12 parameters in the SVM, RF, or ANN), an optimized prediction model is obtained from each of the SML algorithms. These optimized models are used to predict whether a BCU belongs to the BL Lacs or the FSRQs (see Table 2 and Table 3). Using the model generated by the RF algorithm, 835 BL Lac type candidates and 477 FSRQ type candidates are obtained from an optimal combination of 10 parameters (see Table 2) with a maximum accuracy of 0.929. Using the model generated by the ANN algorithm, there is one combination of 12 parameters (see Table 2) with a maximum accuracy of 0.918; here, 868 BCU sources are diagnosed as BL Lac type candidates, and the remaining 444 BCU sources are diagnosed as FSRQ type candidates. For the SVM algorithm, however, there are 15 combinations of 8 parameters (see Table 2) with a maximum accuracy of 0.918. These combinations result in the diagnosis of 858, 859, 817, 816, 812, 816, 817, 817, 815, 817, 819, 817, 816, 818, or 819 BL Lac type candidates, respectively (see Table 2); the remaining 454, 453, 495, 496, 500, 496, 495, 495, 497, 495, 493, 495, 496, 494, or 493 BCUs are diagnosed as FSRQ type candidates.
Better predictions can be obtained by applying various algorithms simultaneously (see Paper I). Based on the optimal combinations with the least number of parameters (e.g., eight parameters in the SVM), the combined results of the RF, the ANN, and one of the SVM models (in boldface, see Table 2) show that 724 BL Lac type candidates and 332 FSRQ type candidates are consistently predicted. However, there are 256 BCUs of uncertain type (unks), for which the results are inconsistent.
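The combination scheme, in which a BCU is labeled only when all algorithms agree and is otherwise left as "unk", can be written compactly. A sketch with toy predictions (the label strings and the four example sources are illustrative, not drawn from Table 3):

```python
import numpy as np

def combine_predictions(pred_rf, pred_ann, pred_svm):
    """Label a BCU only when all three classifiers agree; otherwise 'unk'."""
    preds = np.vstack([pred_rf, pred_ann, pred_svm])
    agree = (preds == preds[0]).all(axis=0)   # unanimous agreement per source
    return np.where(agree, preds[0], "unk")

rf  = np.array(["bll", "fsrq", "bll", "bll"])
ann = np.array(["bll", "fsrq", "fsrq", "bll"])
svm = np.array(["bll", "fsrq", "bll", "fsrq"])
combined = combine_predictions(rf, ann, svm)  # ['bll', 'fsrq', 'unk', 'unk']
```

Applied to the 1312 BCUs with the three optimal models, this unanimity rule yields the 724 BL Lac candidates, 332 FSRQ candidates, and 256 unks quoted above.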

5. DISCUSSION AND CONCLUSIONS
The potential classification of Fermi BCUs is investigated in this work using three established SML algorithms (RF, SVM, and ANN); the optimal combination of parameters is sought. Based on the 4FGL catalog, 3137 Fermi blazars with 23 parameters are selected from the 4FGL FITS table (see Section 2, Section 4, Table 1). Using the algorithms, optimized predictive models are generated, and the classification of BCUs based on their directly observable gamma-ray properties is conducted. For the 23 parameters, all of the possible combinations of the parameters are considered. We find that the previously reported trend of more parameters resulting in higher accuracy is not consistently present. Combinations of eight (the least number of parameters), 10, or 12 parameters in the SVM, RF, or ANN algorithms, respectively, achieve the highest accuracies (accuracy ≃ 91.8% for the SVM and ANN, and ≃ 92.9% for the RF). Based on the optimal combinations of parameters, the combined results (indicated with boldface, see Table 2) of these three algorithms result in the diagnosis of 724 BL Lac type candidates and 332 FSRQ type candidates; however, 256 BCUs of uncertain type remain.

The different physical origins of the γ-ray emission in BL Lacs and FSRQs may account for their distinct gamma-ray properties (e.g., Bhattacharya et al. 2016; Fan et al. 2016b; Yang et al. 2018; Boula et al. 2019). For BL Lacs, the γ-ray emission is generally believed to originate from a pure synchrotron self-Compton (SSC) process (e.g., Mastichiadis & Kirk 1997; Krawczynski et al. 2004; Zhang et al. 2014), while for FSRQs, the γ-ray emission is generally believed to originate from the SSC + EC (external Compton) radiation process (e.g., Sambruna et al. 1999; Böttcher & Chiang 2002; Yan et al. 2014; Ghisellini et al. 2011; Chen & Bai 2011; Zheng et al. 2017). This indicates that the FSRQ jet has a complex external physical environment, which may lead to complex physical properties in the Fermi energy bands (e.g., see Kang et al. 2019 for similar discussions). For instance, the photon spectral index of FSRQs is greater than that of BL Lacs (see Abdo et al. 2010b; Ackermann et al. 2011, 2015; The Fermi-LAT collaboration 2019a,b), which may be the result of the spectrum being superposed with other spectral components for the FSRQs (e.g., Zheng & Kang 2013; Kang et al. 2014; Zheng et al. 2016; Kang et al. 2016; Kang 2017). Therefore, the Fermi band of FSRQs, located at the intersection of the SSC and EC components, may present more complex observational features (e.g., a hard spectrum; Abdo et al. 2009, 2010b; Ackermann et al. 2011, 2015). At a deeper level, the intrinsically different physical origins of the FSRQs and BL Lacs need further investigation. For instance, research directions include the different accretion modes (e.g., see Wang et al. 2002; Cao 2003; Ghisellini et al. 2009; Xu et al. 2009; Sbarrato et al. 2014; Chen et al. 2015; Chen 2018; Gardner & Done 2018 for more details and references therein), where FSRQs have a standard cold accretion disk and BL Lacs have an advection-dominated accretion flow (ADAF; Yuan & Narayan 2014); the mass accretion rate onto the central black hole (e.g., see Boula et al. 2019); and the mass accretion rate and the magnetic field strength (e.g., see Mondal & Mukhopadhyay 2019).
According to the results for the different combinations of parameters, although the prediction accuracies are the same, the optimal combination of parameters is distinct for different algorithms (see Table 4, which shows the number of combinations with the highest accuracy in each algorithm). For instance, the 12-, 10-, and 8-parameter combinations (with the least number of parameters) that achieve the highest accuracy are obtained from the ANN, RF, and SVM algorithms, respectively (see Tables 2 and 4). The predictive results are also different. With the predictive model generated by the RF algorithm, 835 BL Lac and 477 FSRQ candidates are diagnosed with a combination of 10 parameters. With the predictive model generated by the ANN algorithm, 868 BL Lac and 444 FSRQ candidates are diagnosed with a combination of 12 parameters. With the predictive models generated by the SVM algorithm, 858, 859, 817, 816, 812, 816, 817, 817, 815, 817, 819, 817, 816, 818, or 819 BL Lac candidates and 454, 453, 495, 496, 500, 496, 495, 495, 497, 495, 493, 495, 496, 494, or 493 FSRQ candidates are diagnosed with combinations of eight parameters. We should also note that for the SVM algorithm, there are 15 combinations of 8 parameters that achieve the same maximum accuracy (accuracy = 0.918). The diagnosis results are not consistent for different combinations with the same number of parameters and the same accuracy: in the SVM algorithm, the 15 equally accurate combinations produce the different forecasts listed above (see Table 2). When we combine the results of these 15 combinations, 795 BL Lac and 432 FSRQ candidates are diagnosed; 85 have inconsistent diagnoses. The first two predictions (858 or 859 BL Lac and 454 or 453 FSRQ candidates, see Table 2) are slightly different from the predictions of the remaining 13 groups. When we combine the results of the 13 combinations (ignoring these two groups), 811 BL Lac and 491 FSRQ candidates are diagnosed; only 11 have inconsistent diagnoses. When we combine the results of the RF and ANN algorithms with one of the SVMs (one of these 13 combinations, in italic boldface, see Table 2), 728 BL Lac type candidates and 352 FSRQ type candidates are consistently predicted; however, there are 232 BCUs of uncertain type, for which the results are inconsistent. Based on the results of this work, a single optimal combination of parameters is not revealed (see Table 4). Although the maximum accuracy of the RF algorithm is relatively higher than that of the SVM and ANN algorithms, the number of optimal parameters and combinations in the RF and ANN algorithms is relatively smaller than that of the SVM algorithm. This implies that the interactions among different parameters and/or combinations and different algorithms may lead to different diagnostic results. Therefore, the challenging questions of how to find the optimal parameters and combinations, and how to identify the optimal algorithm, remain open.
When comparing the 256 BCUs of uncertain type with the 724 BL Lacs and the 332 FSRQs predicted in this work, we find that the FSRQs have larger median and average values for most parameters (e.g., "PL Index", "LP Flux Density", or "Flux Band2"; see Table 5) compared with the BL Lacs. The remaining parameters ("Pivot Energy", "nuFnu Band7", "Flux Band7", "nuFnu Band6", "Flux Band6") show smaller median and average values for the FSRQs. The median and average values of the uncertain-type BCUs fall into two cases. In the first case, the values are smaller than those of the FSRQs and greater than those of the BL Lacs (M_fsrq > M_unk > M_bll, and A_fsrq > A_unk > A_bll). In the second case, the values are greater than those of the FSRQs and smaller than those of the BL Lacs (M_fsrq < M_unk < M_bll, and A_fsrq < A_unk < A_bll; see Table 5). The median and average values of the uncertain-type BCUs are always located between those of the FSRQs and the BL Lacs. In addition, from the scatter plots (e.g., see Figure 2), we find that these unk sources are usually located in the overlapping regions of the FSRQ and BL Lac distributions. This may make these unk sources difficult to distinguish, leaving them uncertain.

Cross-matching the 4FGL predictions (1312 BCUs) obtained from the combined results of the three algorithms used in this work with the 3FGL predictions (400 BCUs), we obtained 219 common objects (see Table 6). Among the common objects, 119 BL Lac candidates, 66 FSRQ candidates, and 34 unks (still without explicit predictions) were predicted in this work (4FGL prediction). In addition, 113 BL Lac candidates, 47 FSRQ candidates, and 59 unks were previously predicted in the 3FGL work from the combined results of four algorithms (Paper I). For the 113 BL Lac candidates predicted in the 3FGL work (Paper I), most (97 sources, approximately 85.84%) were also predicted as BL Lac candidates in this work, consistent with the 3FGL results; only three sources are predicted as FSRQ candidates, and 13 sources remain without a clear prediction in this work. For the 47 FSRQ candidates reported in the 3FGL work (Paper I), most (40 sources, approximately 85%) were also predicted as FSRQ candidates, no source is predicted as a BL Lac candidate, and seven sources remain without a clear prediction in this work. Overall, approximately 85% of the predicted results are consistent with the results of the 3FGL, which shows that the 3FGL and 4FGL classifications are basically consistent. Also, for the 59 unks predicted in the 3FGL work (Paper I), 22 and 23 sources are classified as BL Lac and FSRQ candidates, respectively; 14 sources are still without a clear prediction in this work. In addition, we also compared the results of each algorithm of the previous work (Paper I); these results are reported in Table 6.
Cross-matching the 3FGL predictions with the BL Lacs and FSRQs that have been spectrally verified in the 4FGL (694 FSRQs and 1131 BL Lacs), 140 sources are verified as BL Lacs and 19 sources are verified as FSRQs in the 4FGL catalog (see Table 6); for these sources, 125 BL Lac candidates, 13 FSRQ candidates, and 21 unks were predicted in the 3FGL work from the combined results of four algorithms (Paper I). For the 140 BL Lacs spectrally verified in the 4FGL, most (120 sources, approximately 85.71%, i.e., 120/140) were also predicted as BL Lac candidates, consistent with the 3FGL predictions; only 3 sources were predicted as FSRQ candidates, and 17 sources were without a clear prediction in the 3FGL work. For the 19 FSRQs spectrally verified in the 4FGL, approximately half (10 sources, consistent with the 3FGL predictions) were also predicted as FSRQ candidates, 5 sources were predicted as BL Lac candidates, and 4 sources remained without a clear prediction in the 3FGL work. The prediction accuracy for BL Lacs is relatively high, while the prediction accuracy for FSRQs is lower. For the results of each algorithm of the 3FGL work (Paper I), similar results are also reported (see Table 6). However, the predictions obtained by applying various methods simultaneously exhibit a higher prediction accuracy (e.g., 120/125, approximately 96%, for the BL Lac predictions; 10/13, approximately 76.9%, for the FSRQ predictions). These results show that our classification prediction approach is reliable.
In addition, we should note that the default settings (e.g., a probability p > 0.5 in each classifier to accept a classification; see Table 3) are used for each of the three classification functions (randomForest(), svm(), and nnet()) in Section 4. For each classification method, the choice of calculation model and the settings of each parameter in the fitting function can also affect the predictive models, the accuracy, and the results (see, e.g., the discussions in Kang et al. 2019). The selection of appropriate parameter settings needs further investigation; this is beyond the scope of the current work. We should also note that approximately 4/5 and 1/5 of the blazars with known classifications are assigned to the training and validation datasets, respectively. The choice of 4/5 and 1/5 is somewhat arbitrary, and the predictive accuracy and results may depend on how the training and validation datasets are drawn; this issue has been discussed in Paper I (see page 8 of Kang et al. 2019 for details). Caution is also warranted because similar or nearly identical parameters (see Table 1), such as PL Index, LP Index, and PLEC Index; PL Flux Density, LP Flux Density, and PLEC Flux Density; or Variability Index and Variability2 Index, are used in the SML at the same time. Furthermore, it must be highlighted that the sample selection methods used in this work (Kang et al. 2018, 2019) may affect the source distributions and the results of the analysis. Nevertheless, this work provides some clues for applying SML algorithms to evaluate the classification of Fermi BCUs. Finally, it should be noted that only a subset of the parameters (23 out of 33) has been selected to search for the optimal parameter combination, based on the preliminary version of the data (see The Fermi-LAT collaboration 2019a), which may introduce bias. More variables and the final complete sample are needed to further test and address this issue.
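The default p > 0.5 rule discussed above, together with the combination of the three classifiers' outputs into a candidate class or "no clear prediction", can be sketched as follows. This is an illustrative reconstruction in Python (the paper's pipeline is in R), and the unanimity rule and function names here are assumptions, not the paper's exact code:

```python
def combine_votes(p_bll_rf, p_bll_svm, p_bll_ann, threshold=0.5):
    """Combine per-classifier BL Lac probabilities (with P_F = 1 - P_B)
    using the default p > 0.5 cut in each classifier. A BCU receives a
    candidate class only when all three classifiers agree; otherwise it
    is left without a clear prediction ('unk')."""
    votes = ["bll" if p > threshold else "fsrq"
             for p in (p_bll_rf, p_bll_svm, p_bll_ann)]
    if all(v == "bll" for v in votes):
        return "bll"
    if all(v == "fsrq" for v in votes):
        return "fsrq"
    return "unk"

print(combine_votes(0.9, 0.8, 0.7))  # bll
print(combine_votes(0.2, 0.1, 0.3))  # fsrq
print(combine_votes(0.9, 0.4, 0.8))  # unk (no clear prediction)
```

Under such a rule, disagreement among the classifiers naturally produces the sources left without a clear prediction.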
The selected sample, consisting of 3137 Fermi blazars (694 FSRQs, 1131 BL Lacs, and 1312 BCUs), is divided into three datasets: training, validation, and forecast. The FSRQs and BL Lacs have known optical classifications. Approximately 4/5 of these are randomly assigned to the training sample (a random seed value of 123 is used), and the remaining approximately 1/5 are used as the validation dataset for the different combinations of parameters in each of the SML algorithms (SVM, RF, and ANN). In summary, the training dataset includes 1460 blazars (905 BL Lacs and 555 FSRQs), the validation dataset contains 365 blazars, and the forecast dataset consists of the 1312 BCUs. The default settings of the three classification functions (svm(), randomForest(), and nnet()) are used to simplify the calculations. After the predictive models are generated and assessed, an effective predictive model is used to forecast whether a BCU belongs to the FSRQ or the BL Lac class based on its predictor variables. The main steps to accomplish this on the R platform are publicly available 4 .
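The seeded 4/5 / 1/5 split described above can be sketched as follows. The paper performs this step in R (e.g., with set.seed(123)); this Python version only illustrates the reproducible-split idea, and the seed here does not generate the same random stream as R:

```python
import random

def split_sample(blazars, train_frac=0.8, seed=123):
    """Randomly assign blazars with known classes to training and
    validation sets, reproducibly via a fixed seed."""
    rng = random.Random(seed)
    shuffled = blazars[:]          # copy so the input order is untouched
    rng.shuffle(shuffled)
    n_train = round(train_frac * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

# Known-class sample: 1131 BL Lacs + 694 FSRQs = 1825 blazars.
known = [("bll", i) for i in range(1131)] + [("fsrq", i) for i in range(694)]
train, valid = split_sample(known)
print(len(train), len(valid))  # 1460 365
```

The resulting sizes match the paper's training (1460) and validation (365) datasets; the class balance within each split (e.g., 905 BL Lacs and 555 FSRQs in training) depends on the particular random draw.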

Figure 1. (Color online) Highest accuracy for the different numbers of parameters per combination in each SML algorithm: Random Forests (red solid line + empty squares), Support Vector Machines (blue dashed line + empty circles), and Artificial Neural Networks (green dotted-dashed line + stars).

Figure 2. (Color online) Classification scatter diagram for a subset of the parameters (PL Index, Pivot Energy, and PL Flux Density). Only a portion is shown here for guidance regarding its form and content. The red empty squares, black empty triangles, and blue solid circles indicate FSRQs, BL Lacs, and uncertain-type BCUs, respectively.

Table 1. The results of the two-sample tests for the 1131 BL Lacs and 694 FSRQs.

Table 2. The test accuracy, prediction results, and parameters for the optimal combinations. Note—The classifiers are presented in Column 1, and the number of parameters in the optimal combination (with the fewest parameters) is presented in Column 2. The numbers of BL Lacs and FSRQs predicted by each supervised classifier (using SML techniques) for the BCUs (forecast dataset) are presented in Columns 3 and 4. The highest accuracy of each classifier is presented in Column 5. The labels of the parameters are presented in Columns 6-17; these correspond to the labels in Table 1, Column 1. Here, only the optimal combination with the fewest parameters is shown for guidance regarding its form and content. The complete results are available in machine-readable format (see Table2 combination.xlsx).

Table 3. The classification of Fermi BCUs. Note—The 4FGL names are presented in Column 1 and the counterpart names in Column 2. The optical classes are presented in Column 3 (e.g., BCU as reported in The Fermi-LAT collaboration 2019a). The classifications of the Fermi BCUs using Random Forests (CRF), Support Vector Machines (CSVM), and Artificial Neural Networks (CANN) are presented in Columns 4, 7, and 10, where "bll" indicates BL Lac and "fsrq" indicates FSRQ. The probabilities P_{Bi,X} and P_{Fi,X} that a BCU i belongs to the BL Lac or FSRQ class are shown in Columns 5 and 6 for the RF method, in Columns 8 and 9 for the SVM method, and in Columns 11 and 12 for the ANN method, respectively. Table 3 is published in its entirety in machine-readable format (see Table3 classification.xlsx). A portion is shown here for guidance regarding its form and content.

Table 4. The number of optimal parameters and combinations. Note—The classifiers are presented in Column 1. The numbers of combinations with the highest accuracies in each algorithm for 8, 9, 10, 11, 12, 15, 16, and 17 parameters are presented in Columns 2-9 (see the machine-readable format of Table 2).

Table 5. Median and mean values for the 724 BL Lac candidates, 332 FSRQ candidates, and 256 unks. Note—Column 1 presents the parameter labels in the sample, and Column 2 lists the selected parameters. The median values of the FSRQs (M_fsrq), uncertain-type BCUs (M_unk), and BL Lacs (M_bll) are presented in Columns 3, 4, and 5, respectively. The mean values of the FSRQs (A_fsrq), uncertain-type BCUs (A_unk), and BL Lacs (A_bll) are presented in Columns 6, 7, and 8, respectively.

Table 6. Comparison of the 3FGL predictions and the 4FGL results. Note—The classifiers and classes are presented in Columns 1 and 2. Columns 3-6 show the comparison of the 3FGL predictions with the 4FGL predictions for the common objects. The comparison of the 3FGL predictions with the BL Lacs and FSRQs verified in the 4FGL is presented in Columns 7-9. Here, a and b indicate the numbers of 3FGL predictions; c indicates the number of 4FGL sources obtained by cross-matching the 3FGL predictions with the 4FGL predictions, or with the sources verified in the 4FGL.