Comparison of Machine Learning Algorithms for Natural Gas Identification with Mixed Potential Electrochemical Sensor Arrays

Mixed-potential electrochemical sensor arrays consisting of indium tin oxide (ITO), La0.87Sr0.13CrO3, Au, and Pt electrodes can detect leaks from natural gas infrastructure. Algorithms are needed to correctly distinguish natural gas sources from natural and anthropogenic background sources such as wetlands or agriculture. We report for the first time a comparison of several machine learning methods for mixture identification in the context of natural gas emissions monitoring by mixed potential sensor arrays. Random Forest, Artificial Neural Network, and Nearest Neighbor methods successfully classified air mixtures containing only CH4, two types of natural gas simulants, and CH4+NH3 with >98% identification accuracy. The model complexity of these methods was optimized and the degree of robustness against overfitting was determined. Finally, these methods were benchmarked on both desktop PC and single-board computer hardware to simulate their application in a portable internet-of-things sensor package. The combined results show that the Random Forest method is the preferred method for mixture identification because of its high accuracy (>98%), its robustness against overfitting with increasing model complexity, and its training time of less than 0.1 s and inference time of less than 0.1 ms on single-board computer hardware.

Natural gas infrastructure is widely deployed across the United States with over 305,000 miles of transport pipeline and hundreds of facilities dedicated solely to the storage of natural gas. 1 These pipelines contain an estimated 630,000 leaks, emitting over 0.69 teragrams of methane per year 2 and accounting for upwards of 2 billion dollars of economic impact annually. 3 Because of the significant environmental and economic impact, it is crucial to be able to both detect and locate these leaks in natural gas infrastructure. Technologies available for methane emissions monitoring currently include IR optical sensors, photoacoustic sensors, semiconductor sensors, cavity ringdown sensors, catalytic sensors, and mixed potential electrochemical sensor arrays. Each sensor type yields advantages and disadvantages in cost, accuracy, and selectivity among other factors. Development of a low-cost, robust sensor with high natural gas selectivity is critical for emissions detection.
Methods of detection for sensors include IR optical devices, metal oxide semiconductors, and mixed potential sensors. IR optical sensors using tunable diode laser absorption spectroscopy (TDLAS), non-dispersive infrared (NDIR), and photoacoustic sensing present a selective approach to natural gas detection with detection limits at the ppm level. 4,5 Previous works in methane 6 and liquid natural gas detection 7 demonstrate the efficacy of using absorption at near- and mid-infrared wavelengths. 8 However, challenges for deploying IR optical sensors in the field include their high cost, limited speciation, and the requirement that any optical cavities be kept clean of debris. Extremely long pathlength cavity ringdown IR spectrometers are also being developed for atmospheric measurements of methane 9,10 as well as for leak monitoring at natural gas production and processing facilities. 11 Metal oxide semiconductor sensors provide another option for gas sensing. 12 They have been used for methane and other hydrocarbon sensing, with specific applications in detecting ethane and identifying leaks in natural gas pipelines. 13 Semiconductor sensors are much less expensive and easier to produce than optical sensors while also being sensitive to the gases of interest. Ongoing attempts to address their main challenge, sensor drift, show promise. 14
Mixed Potential Electrochemical Sensors (MPES) combine the robustness and selectivity of optical sensors with the lower cost and ease of manufacturing characteristic of semiconductor sensors. 15 MPES devices consist of dissimilar electrodes embedded in a solid electrolyte material, where oxidation and reduction reactions occur simultaneously at the triple-phase interface. 15 Differing rates of oxidation and reduction reactions at each electrode, due to the catalytic activity of the electrode materials, determine the mixed potential on each electrode. [15][16][17] The difference in the mixed potential between electrodes is measured by a voltmeter as the sensing parameter. By varying the materials used in each electrode, MPES arrays can be configured for multi-gas selectivity. Many metals and metal oxides have been tested as electrode materials for increasing the sensitivity and selectivity toward target gases. Indium tin oxide (ITO), Au, La0.8Sr0.2CrO3 (LSC), and Pt have been shown to detect methane, heavy hydrocarbons, ammonia, and hydrogen, [18][19][20][21][22][23] while yttria-stabilized zirconia (YSZ) is typically used as the electrolyte. 24 These materials were chosen as the four electrodes for the sensor array, and YSZ was chosen as the electrolyte for this study. Similar MPES devices have been used for monitoring automotive emissions. 25,26
The application of machine learning (ML) techniques to the chemical sensor field opens new avenues to take advantage of parallel computing power, edge computing, and big datasets. [27][28][29] Supervised machine learning algorithms take a set of features, such as sensor signals, and map them to either a category (classification) or a value (regression). An example of the former is discriminating a natural gas emission from a wetlands emission, and an example of the latter is predicting the concentration of CH4 in the air. Prior applications of machine learning to natural gas detection have made progress in increasing the accuracy of optical sensors. [30][31][32] Additionally, machine learning optimizations to signal decoding have the potential to enable the deployment of smart sensor arrays for natural gas infrastructure monitoring. 33,34 Combining ML algorithms with unmanned aerial and ground vehicles has also been shown to be a promising method for surveying natural gas infrastructure for leaks where large areas in remote locations need to be covered. [35][36][37] Other surveys have compared different machine learning techniques applied to other types of gas sensors, 38,39 and specific studies such as Tsitron et al. 40 demonstrate the application of a single machine learning technique, such as Bayesian decoding, to MPES devices. We have previously demonstrated the effectiveness of artificial neural networks for gas mixture identification in the context of natural gas emissions 18 and automotive emissions. 21,22 Wang et al. also showed that taking readings from an MPES device across a range of polarizations could accurately identify nine VOC species on a single sensor with random forest and gradient boosting methods. 41 Field deployable sensors typically rely on low-power internet of things hardware, and several platforms exist which can perform both data collection and analysis simultaneously. 42 In the room temperature sensing space, Huang et al. showed that NH3 and PH3 could be detected at the 100 ppb level with 100% and 78% accuracy, respectively, using exfoliated graphene amperometric sensors. 43 ML methods have also shown promise when applied to the design of the sensors. 44 Wang et al. demonstrated that machine learning using Random Forest and three gradient boosting techniques could identify 13 materials for MPES electrodes selective for NO2 detection out of a total of 8000 materials. 45
z E-mail: lktsui@unm.edu
The field of edge computing concerns the development of methods to process data close to where it is collected. Machine learning methods in edge computing have to deal with both limited computational power and intermittent or high-latency network connections. 46,47 Machine learning methods which can be both trained and used on low-power hardware are needed. To date, no benchmark of mixture identification algorithms for MPES devices has taken the required processing power into consideration alongside accuracy. If an algorithm can be shown to be both accurate and computationally efficient, it can readily be integrated into low-power edge computing devices.
We report a comparison of six methods for the identification of simulated gas mixtures which span a range of computational complexity and algorithmic approaches: Logistic Regression, Naïve Bayes, Nearest Neighbor, Support Vector Machine (SVM), Random Forest, and Artificial Neural Network (ANN). Their accuracies were measured on datasets acquired at temperatures between 450°C and 600°C. Random Forest, Nearest Neighbor, and Artificial Neural Network methods were then further optimized to achieve at least 98% test accuracy. Finally, these methods were benchmarked on desktop PC and single-board computer hardware for training and inference processing speed.

Experimental
Data collection.-Sensors were manufactured on ceria stabilized zirconia (CSZ) substrates. The substrates were prepared in the following manner. First, 75 wt% zirconia and 25 wt% ceria powders were ball milled in isopropanol for 12 h and then sintered at 1175°C for 24 h. The sintered powder was ball milled for another 12 h and then passed through a 200 μm sieve. The paste used for printing was prepared by mixing a dispersant (polyvinyl alcohol), binder (ascorbic acid), plasticizer (polyethylene glycol), and water in a Thinky centrifugal mixer for 60 s at 2000 rpm. The powder was then added in three steps, with 60 s of mixing at 2000 rpm after each addition, for a solid loading of 72.5 wt%. The paste was then transferred to a syringe for direct write extrusion printing. A Hyrel System 30M was used for the printing of the substrate, Pt electrode, and Pt conducting leads for the sensing electrodes. Indium tin oxide (ITO), La0.87Sr0.13CrO3 (LSC), and Au electrodes were added and fired according to the process outlined in Halley et al. 18 A photograph of a completed sensor appears in Fig. 1.
The sensors were bonded with Ag paste to Ag wires that were fed through a 4-bore alumina tube. Sensor testing was carried out in a tube furnace set to 450, 500, 550, and 600°C. Gas concentrations were controlled with an Environics Series 2000 gas mixer. The oxygen content was fixed at 21% and a balance gas of N2 was used for all tests. Gas cylinders containing 1% CH4, 1% CH4 natural gas high ethane simulant, 1% CH4 natural gas low ethane simulant, 500 ppm NH3, and 500 ppm C2H6 were used. All test gas cylinders contained balance N2. The composition of the natural gas simulant mixtures is provided in Supplementary Information, Table SI. The concentration of CH4 tested was in a range of 300-2500 ppm. The ethane concentration was modulated in the two natural gas mixes by mixing in additional C2H6. CH4+NH3 binary mixtures with NH3 content ranging from 8 to 325 ppm were also tested to simulate emissions from agricultural sources. The NH3:CH4 ratio varied between 0.03 and 1.08. We evaluated the four different mixtures described in Table I, which serve as the labels for mixture classification.
Machine learning technique comparison.-Artificial neural networks were implemented in Python 3.7 with NumPy (version 1.18.1) and Tensorflow (version 2.1.0). Extreme Gradient Boost was implemented using the dmlc XGBoost library. 48 All other methods were implemented in Python 3.7 with NumPy (version 1.18.1) and Scikit-Learn (version 0.24.0). 49 The gases are categorized according to the four mixture types in Table I. From these groupings, 80% of each group is randomly selected to serve as "training" data while the remaining 20% of each group is selected to be "testing" data. Each train-test trial consists of training the given model on the training data and testing the trained model on the testing data. The results of 20 such trials are accumulated in a 4x4 confusion matrix.
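The per-group 80:20 split and the accumulation of trial results into a 4x4 confusion matrix can be sketched as follows. This is a minimal illustration in plain Python, not the Scikit-Learn code used in this work: the label names, the synthetic three-voltage data from `make_synthetic`, and the 1-nearest-neighbor stand-in classifier are all assumptions made for the example.

```python
import random
from collections import defaultdict

# Hypothetical label names standing in for the four mixture types of Table I.
LABELS = ["CH4", "NG low ethane", "NG high ethane", "CH4+NH3"]


def stratified_split(samples, train_fraction, rng):
    """Split (features, label) pairs so every label keeps the same train fraction."""
    by_label = defaultdict(list)
    for sample in samples:
        by_label[sample[1]].append(sample)
    train, test = [], []
    for group in by_label.values():
        group = group[:]          # copy so the caller's order is untouched
        rng.shuffle(group)
        cut = round(train_fraction * len(group))
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test


def predict_1nn(train, features):
    """Stand-in classifier: label of the closest training point."""
    nearest = min(train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], features)))
    return nearest[1]


def accumulate_confusion(samples, n_trials=20, seed=0):
    """Sum test-set counts over repeated splits; rows = actual, columns = predicted."""
    rng = random.Random(seed)
    index = {label: i for i, label in enumerate(LABELS)}
    matrix = [[0] * len(LABELS) for _ in LABELS]
    for _ in range(n_trials):
        train, test = stratified_split(samples, 0.8, rng)
        for features, actual in test:
            matrix[index[actual]][index[predict_1nn(train, features)]] += 1
    return matrix


def make_synthetic(n_per_label=25, seed=1):
    """Synthetic three-voltage signals; the cluster centres are invented."""
    rng = random.Random(seed)
    samples = []
    for i, label in enumerate(LABELS):
        centre = (-10.0 * (i + 1), 5.0 * i, -3.0 * i)
        for _ in range(n_per_label):
            samples.append((tuple(c + rng.gauss(0, 1.0) for c in centre), label))
    return samples


if __name__ == "__main__":
    for label, row in zip(LABELS, accumulate_confusion(make_synthetic())):
        print(f"{label:>15}: {row}")
```

Dividing each row by its sum gives the percentage of an actual label assigned to each predicted label, which is the quantity a confusion-matrix heat plot displays.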
Learning curves were calculated by first performing an 80%:20% train:test split, and then randomly selecting between 9 and 188 points from the training dataset for training. This process was iterated 100 times for each algorithm. The methods and default starting parameters used are as follows:
While this is not an exhaustive list of possible classification algorithms, the selected methods cover a range of approaches including linear models (Logistic Regression), ensemble methods (Random Forest), and black-box methods (Artificial Neural Networks). These methods also cover a range of complexities from computationally simple (Logistic Regression) to intensive (Artificial Neural Networks). The three best methods were then down-selected for further optimization to evaluate the effect of model complexity. A description of the operating principle of these methods is provided in Supplementary Information under "Operating Principle of Machine Learning Methods."
Processing time benchmarking was performed on a desktop PC and a single-board computer to simulate the operation of these algorithms on low-power hardware. The custom-built desktop PC had an Advanced Micro Devices, Inc. Ryzen 7 3700X (3.6 GHz) processor, the Ubuntu 22.04.1 LTS operating system, 64 GB of RAM, and an 850 W power supply. The single-board computer was a Raspberry Pi 3 Model B Rev 1.2 with an ARM Cortex-A53 (900 MHz) processor, the Raspberry Pi OS 11 operating system, 1 GB of RAM, and a USB power connection drawing at most 4.1 W. The training and inference times for the three optimized identification methods were recorded and averaged over 10 iterations. Inference time is defined as the time needed for the model to predict one label from one set of three voltage signals.
These electrode materials were chosen for their selectivity. ITO vs Pt is sensitive towards CH4, 23 LSC vs Pt is sensitive to heavy hydrocarbons, 21 and Au vs Pt is sensitive to NH3. 22 The three electrode pair signals are expected to positively identify natural gas by sensing the simultaneous presence of CH4 and heavy hydrocarbons in the absence of NH3. All three sensor signals become more negative with increasing CH4 concentration. A stronger LSC vs Pt response is observed with the increasing heavy hydrocarbon concentration of the two NG mixes. An enhanced Au vs Pt signal and a suppressed ITO vs Pt signal are observed with the NH3+CH4 binary mixtures.
Figure 3 displays the confusion matrices for each model using the starting parameters listed in Section 2.2. The horizontal axis represents the actual label of samples in the test set while the vertical axis represents the label the model predicted for those samples. Each entry in the heat plot represents the percentage of the corresponding actual label that the model identified as the corresponding predicted label, so the principal diagonal gives the proportion of each label that the model predicted correctly. As shown in Table II, all methods achieved at least 90% test accuracy at 550°C, and the best algorithms (RF, ANN, and NN) achieved >98% test accuracy. These three methods are further optimized in the next section. Identification accuracy is maximized at this temperature because of an optimum in sensitivity of the multi-electrode sensors: the temperature is high enough to readily catalyze the CH4 oxidation reaction, but low enough that the presence of heavier hydrocarbon or NH3 oxidation can be readily detected by the mixed potential difference created by the choice of electrode materials. 18
Learning curves for these algorithms are presented in Supplementary Information, Fig. S1. Logistic Regression and SVM algorithms show a linear increase in test accuracy from 9 to 188 training data points.
Random Forest and Nearest Neighbor methods show a rapid increase in test accuracy from 9 to 50 training data points, followed by a shallow linear trend converging on their final test accuracy. Finally, Naïve Bayes and ANNs reach a plateau at 50 training data points and show no further improvement in test accuracy when the training dataset size is increased to 188.
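The learning-curve procedure (an 80:20 split, then training on a random subset of the training portion, repeated and averaged) can be sketched as below. This is an illustrative outline under stated assumptions rather than the code behind Fig. S1: the Gaussian Naïve Bayes classifier is a small plain-Python stand-in for the Scikit-Learn implementation, the split is a simple shuffle rather than a per-group split, and `make_synthetic`, its cluster centres, and the `var_floor` parameter are invented for the example.

```python
import math
import random


def fit_gaussian_nb(train, var_floor=1e-3):
    """Per-label log prior, feature means and (floored) variances."""
    grouped = {}
    for features, label in train:
        grouped.setdefault(label, []).append(features)
    model = {}
    for label, rows in grouped.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        variances = [max(sum((x - m) ** 2 for x in col) / n, var_floor)
                     for col, m in zip(zip(*rows), means)]
        model[label] = (math.log(n / len(train)), means, variances)
    return model


def predict_gaussian_nb(model, features):
    """Label with the highest log posterior under independent Gaussians."""
    def log_posterior(label):
        log_prior, means, variances = model[label]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for x, m, v in zip(features, means, variances))
    return max(model, key=log_posterior)


def learning_curve(samples, train_sizes, n_repeats=100, seed=0):
    """Mean test accuracy for each training-subset size."""
    rng = random.Random(seed)
    curve = {}
    for size in train_sizes:
        scores = []
        for _ in range(n_repeats):
            shuffled = samples[:]
            rng.shuffle(shuffled)
            cut = int(0.8 * len(shuffled))            # 80:20 train:test split
            train_pool, test = shuffled[:cut], shuffled[cut:]
            model = fit_gaussian_nb(rng.sample(train_pool, size))
            hits = sum(predict_gaussian_nb(model, f) == y for f, y in test)
            scores.append(hits / len(test))
        curve[size] = sum(scores) / n_repeats
    return curve


def make_synthetic(n_per_label=60, seed=1):
    """Four invented, well-separated clusters of three-voltage signals."""
    rng = random.Random(seed)
    samples = []
    for i in range(4):
        centre = (-10.0 * (i + 1), 5.0 * i, -3.0 * i)
        for _ in range(n_per_label):
            samples.append((tuple(c + rng.gauss(0, 1.0) for c in centre), f"mixture {i}"))
    return samples


if __name__ == "__main__":
    for size, accuracy in learning_curve(make_synthetic(), [9, 25, 50, 100, 188]).items():
        print(f"{size:>4} training points: mean test accuracy {accuracy:.3f}")
```

Plotting mean accuracy against subset size for each classifier gives curves whose shape (linear rise, rapid rise then shallow slope, or early plateau) can be compared as in the text.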

Results and Discussion
Optimization of RF, NN, and single hidden layer ANN algorithms was carried out to identify the influence of model complexity on test accuracy. As model complexity increases, a model either reaches a point where it can no longer generalize (overfitting) or a point beyond which no further improvement is achieved. Figure 4 shows the optimization of these three best performing algorithms on data measured at 550°C. The complexity parameters are the number of trees, neighbors, and hidden layer neurons for the RF, NN, and ANN algorithms, respectively. RF has a 98% average accuracy that saturates at about 10 trees. Testing out to 1000 decision trees neither improved nor decreased the accuracy, indicating that this algorithm is robust against overfitting. Alternative ensemble methods in the form of the gradient boosting decision tree and extreme gradient boost 50 were evaluated, but showed no improvement in performance compared with Random Forests in the range of 10-1000 estimators (Supplementary Information, Fig. S2). The NN algorithm starts with >98% accuracy and declines as the number of neighbors increases. This decrease in accuracy is a result of closely spaced mixture categories: increasing the number of neighbors pulls in neighbors that belong to the wrong gas mixture.
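The effect of the neighbor count on a Nearest Neighbor classifier can be illustrated with a small sweep. The sketch below is a plain-Python assumption-laden example, not the analysis behind Fig. 4: it uses leave-one-out evaluation instead of repeated 80:20 splits, and the synthetic dataset from `make_synthetic` is deliberately tiny (10 points per mixture) so that a large neighbor count must reach into adjacent mixture clusters.

```python
import random
from collections import Counter


def knn_predict(train, features, k):
    """Majority label among the k training points closest to `features`."""
    ranked = sorted(train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], features)))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(samples, k):
    """Classify each sample from all the others; return the fraction correct."""
    hits = 0
    for i, (features, label) in enumerate(samples):
        others = samples[:i] + samples[i + 1:]
        hits += knn_predict(others, features, k) == label
    return hits / len(samples)


def sweep_neighbors(samples, k_values):
    """Accuracy for each candidate number of neighbors."""
    return {k: leave_one_out_accuracy(samples, k) for k in k_values}


def make_synthetic(n_per_label=10, seed=0):
    """Four invented mixture clusters laid out along a line in signal space."""
    rng = random.Random(seed)
    samples = []
    for i in range(4):
        centre = (3.0 * i, -2.0 * i, 1.0 * i)
        for _ in range(n_per_label):
            samples.append((tuple(c + rng.gauss(0, 0.5) for c in centre), f"mixture {i}"))
    return samples


if __name__ == "__main__":
    for k, accuracy in sweep_neighbors(make_synthetic(), [1, 3, 5, 15, 25]).items():
        print(f"k = {k:>2}: leave-one-out accuracy {accuracy:.2f}")
```

Once k exceeds the number of same-mixture points available near a sample, the vote is dominated by points from neighboring clusters, which is the mechanism described above for the decline in accuracy.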
Finally, the ANN algorithm shows 98% accuracy once 10 hidden layer neurons are used with the tanh and rectified linear unit (ReLU) activation functions, but accuracy with the sigmoidal activation function declines beyond 10 hidden layer neurons. This decline may result from the larger number of parameters in a larger hidden layer: training reaches a local minimum with a larger training error, and the dataset is too small to guide it to a better optimum. This does not appear to be overfitting, which would show up as a decrease in training error and an increase in test error. Instead, both the test accuracy and training accuracy decline simultaneously. All three algorithms can achieve the target objective of >98% accuracy, but either RF or NN will be the preferred algorithm due to their lower computational cost compared to ANNs.
Results for benchmarking of processing times are shown in Table III. The model complexity was set at a level where additional complexity did not improve identification accuracy. The RF method was trained with 10 decision trees, the NN method was trained as a 5-neighbor model, and the ANNs were trained with 10 hidden layer neurons using the tanh activation function. All training runs used 188 data points. Comparison of the training times on desktop PC hardware shows that ANNs are 3 orders of magnitude slower to train than an optimized Random Forest and 5 orders of magnitude slower than the Nearest Neighbor method. On the Raspberry Pi single-board computer, the training time for both the Random Forest and Nearest Neighbor methods is <0.1 s, while ANNs require close to 2 min to complete training. The inference time for a single 3-voltage data point is also shown in Table III. The inference time for all three methods is <5 ms, with the Random Forest and Nearest Neighbor methods being <0.1 ms per data point. All three methods are sufficiently fast for inference given that response times for MPES devices are on the order of 1 s. However, the longer training time for ANNs makes them challenging to deploy if the algorithm needs to be frequently recalibrated.
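A timing measurement of this kind (training time and single-point inference time, each averaged over 10 iterations) can be sketched with the standard-library `time.perf_counter`. The harness below is a hedged illustration, not the benchmarking script used for Table III: the `benchmark` function, the nearest-centroid stand-in model, and the synthetic 188-point training set are all assumptions, and any classifier exposing a fit and a predict callable could be substituted.

```python
import time
from statistics import mean


def benchmark(fit, predict, train, query, n_iterations=10):
    """Average wall-clock training time and single-point inference time in seconds."""
    train_times, infer_times = [], []
    for _ in range(n_iterations):
        start = time.perf_counter()
        model = fit(train)
        train_times.append(time.perf_counter() - start)

        start = time.perf_counter()
        predict(model, query)          # one label from one set of three voltages
        infer_times.append(time.perf_counter() - start)
    return {"train_s": mean(train_times), "inference_s": mean(infer_times)}


def fit_centroids(train):
    """Stand-in model: mean signal vector per label."""
    sums = {}
    for features, label in train:
        total, count = sums.get(label, ([0.0] * len(features), 0))
        sums[label] = ([t + x for t, x in zip(total, features)], count + 1)
    return {label: [t / n for t in total] for label, (total, n) in sums.items()}


def predict_centroid(model, features):
    """Label whose centroid is closest to the query signals."""
    return min(model, key=lambda label: sum((c - x) ** 2 for c, x in zip(model[label], features)))


def make_training_set():
    """188 invented (three-voltage, label) points split evenly over two labels."""
    ch4 = [((-10.0 - 0.01 * i, 1.0, 2.0), "CH4") for i in range(94)]
    ng = [((-40.0 - 0.01 * i, -5.0, 8.0), "NG") for i in range(94)]
    return ch4 + ng


if __name__ == "__main__":
    result = benchmark(fit_centroids, predict_centroid, make_training_set(), (-12.0, 1.0, 2.0))
    print(f"training: {result['train_s'] * 1e3:.3f} ms, "
          f"inference: {result['inference_s'] * 1e3:.4f} ms")
```

Running the same script unchanged on the desktop PC and on the single-board computer gives directly comparable numbers, since only the hardware differs between the two runs.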

Conclusions
We evaluated methods to identify mixtures of CH4, two natural gas simulant mixes, and CH4+NH3 at CH4 concentrations between 300 and 2500 ppm from signals collected from a 4-electrode MPES array. We demonstrated that data collected at 550°C resulted in optimal identification accuracy. Random Forests, Nearest Neighbors, and Artificial Neural Networks were optimized, and all could achieve >98% test accuracy. Random Forests were found to be robust against overfitting and had the largest model complexity window over which high test accuracy was maintained. Finally, the performance of these algorithms was evaluated on single-board computer hardware. Random Forest and Nearest Neighbor methods were found to have <0.1 s training time and <0.1 ms inference time. Given their combination of high accuracy, robustness against overfitting, fast training speed, and fast inference speed on low-power hardware, we conclude that Random Forests would be the best performing mixture identification method for natural gas emissions discrimination in a low-power edge computing context.