Data-Driven Modeling of Smartphone-Based Electrochemiluminescence Sensor Data Using Artificial Intelligence

Understanding relationships among multimodal data extracted from a smartphone-based electrochemiluminescence (ECL) sensor is crucial for the development of low-cost point-of-care diagnostic devices. In this work, artificial intelligence (AI) algorithms, namely random forest (RF) and feedforward neural network (FNN), are used to quantitatively investigate the relationships between the concentration of the Ru(bpy)₃²⁺ luminophore and its experimentally measured ECL and electrochemical data. A smartphone-based ECL sensor with Ru(bpy)₃²⁺/TPrA was developed using disposable screen-printed carbon electrodes. ECL images and amperograms were simultaneously obtained following 1.2-V voltage application. These multimodal data were analyzed by RF and FNN algorithms, which allowed the prediction of Ru(bpy)₃²⁺ concentration using multiple key features. High correlation (0.99 and 0.96 for RF and FNN, respectively) between actual and predicted values was achieved in the detection range between 0.02 µM and 2.5 µM. The AI approaches using RF and FNN were capable of directly inferring the concentration of Ru(bpy)₃²⁺ using easily observable key features. The results demonstrate that data-driven AI algorithms are effective in analyzing the multimodal ECL sensor data. Therefore, these AI algorithms can be an essential part of the modeling arsenal for ECL sensor data.


Introduction
Electrochemiluminescence (ECL) is being explored in research ranging from fundamental studies to its application as a platform for light-emitting sensors and as an analytical detection method. Because ECL does not require any external excitation light source, it offers very high sensitivity and a very low background signal. In addition, it allows minimal instrumentation due to the simplicity of voltage application, rapid measurements (only a few seconds), localized light emission (geometric location of light on a working electrode), and a cost-effective set-up [1]. These are the inherent advantages of ECL over other light emission-based techniques such as photoluminescence and chemiluminescence [2]. In this context, the smartphone can be an alternative to the expensive traditional instrumentation for ECL sensors such as the photomultiplier tube (PMT). Smartphones are

Sensor Apparatus and Electrodes
Simultaneous measurements of sequences of ECL images and amperograms (current vs. time) were carried out using a mobile phone-based ECL sensor apparatus. The sensor design interfaces a custom compact potentiostat with a mobile phone (Samsung Galaxy S7) running a custom-made app that controls the potentiostat parameters and the phone camera for time synchronization (Figure 2a). The compact potentiostat was customized from an open-source potentiostat shield named Rodeostat (designed from the Teensy 3.2 board; IO Rodeo, Pasadena, CA, USA) in a three-electrode set-up. Disposable screen-printed carbon electrodes (DropSens, DRP-110) were used, consisting of a carbon working electrode (4 mm diameter), a carbon ink counter electrode, and a silver reference electrode printed on a flat ceramic card. Figure 2b illustrates the basic operation of the portable potentiostat circuit. The signal and the voltage (in blue letters) are generated through the microcontroller unit (MCU) attached on the board. The MCU is modulated according to a square waveform signal (however, it could also be a sine or triangular waveform) and an input voltage. The signal and the voltage feed the control amplifier, which is a servo amplifier, to adjust the amplitude to the desired current applied on the counter electrode. During tests, the electrometer measures the voltage difference between the reference and working electrodes and feeds it back to the control amplifier to keep the voltage at the desired value. The current flowing through the working electrode is measured at the I/E converter, which is a current-to-voltage converter, and it is recorded and displayed as a current vs. time graph. The phone camera was set to pro mode with autofocus mode at ISO 3200, and burst mode was used to collect two-dimensional (2D) ECL image sequences with 8-20 frames per second (FPS).
During experiments, the cell phone camera was aligned with the hole of the container to fit the mobile phone camera and placed just above the working electrode. The custom potentiostat was connected with the cell phone on one side and the screen-printed electrodes (SPEs) on the other side.
Sensors 2020, 20, 625

Figure 2. Schematic diagram of (a) the mobile phone-based electrochemiluminescence (ECL) sensor apparatus that mainly comprises (1) a magnifying lens, (2) screen-printed electrodes, (3) a smartphone, (4) a potentiostat circuit, (5) a light-tight container, (6) a Universal Serial Bus (USB) cable, and (7) a cable to the battery or USB port; (b) the basic operation of the portable potentiostat circuit.

Assays
A 1 mM stock solution of Ru(bpy)₃²⁺ in Milli-Q water was diluted to provide sample solutions from 0.02 to 2.5 µM of Ru(bpy)₃²⁺. Each sample solution was mixed with 20 mM TPrA in 0.1 M PBS, constituting a Ru(bpy)₃²⁺/TPrA system. The reproducibility and repeatability of this system were demonstrated elsewhere [1]. Measurements were performed at room temperature by dropping 50 µL of Ru(bpy)₃²⁺/TPrA solution onto the carbon working electrode surface. A waiting time of 10 min was allowed to reduce the electrode contact resistance. Then, the ECL reaction was triggered by applying 1.2 V, while simultaneously measuring the ECL emission and the current at the carbon working electrode.

Electrochemical and ECL Experimental Data Generation
Experimental data generation is a critical step in the construction of AI algorithms. The performance of the AI algorithms depends largely on the quality of the data used in the training step. This study used electrochemical and ECL data from measurements performed with the mobile phone-based ECL sensor for training the AI algorithms.
The procedure for experimental data generation used a forward approach as illustrated in Figure 3a, where the electrochemical and ECL data were determined given a concentration of Ru(bpy)₃²⁺. In this procedure, the ECL sensor used the chronoamperometry technique (an example of real data is shown in Figure 4), where a square waveform potential was applied to the carbon working electrode with 50 µL of Ru(bpy)₃²⁺/TPrA sample solution. To simultaneously measure the electrochemical and ECL data for each concentration of Ru(bpy)₃²⁺, the portable potentiostat was set to apply a potential of 0 V vs. Ag/Ag+ for 1 s, followed by −1.2 V vs. Ag/Ag+ for 1 s, and finally 1.2 V vs. Ag/Ag+ for 1 s (Figure 4a). The potentials 0 V vs. Ag/Ag+ and −1.2 V vs. Ag/Ag+ were used to stabilize the system while avoiding oxidation of Ru(bpy)₃²⁺. The potential of 1.2 V vs. Ag/Ag+ produced ECL upon concomitant oxidation of Ru(bpy)₃²⁺ and TPrA. Typical transient current and ECL responses recorded over the course of the stabilization and oxidation periods are shown in Figure 4b,c, respectively. Figure 4d,e show the zoom-in view of the shaded area in Figure 4b,c, respectively. Figure 4e also shows the current derivative signal (brown line) corresponding to the current response (blue line). From these data, three key features were identified: the maximum value of the current peak (Cmaxp), the minimum derivative value of the current (Cmind), and the decay slope of the ECL intensity (ECLsl), shown in red letters (Figure 4d,e). It is worth mentioning that the estimated slopes explained the decay of ECL intensities accurately, with a coefficient of determination, R², above 0.85 for all measurements. The three key features chosen were the input variables of the data-driven models. The output variable was the concentration of Ru(bpy)₃²⁺.
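The extraction of the three key features from one measurement can be illustrated with a short sketch. The text does not state how the derivative or the decay slope was computed, so the forward finite difference and the ordinary least-squares line used here are assumptions, and the function name `extract_features` and its arguments are hypothetical.

```python
def extract_features(t, current, t_ecl, ecl):
    """Return (Cmaxp, Cmind, ECLsl, R2) for one measurement.

    t, current : time and current samples of the oxidation step
    t_ecl, ecl : time and ECL intensity samples of the decay region
    Assumptions: forward finite differences for the derivative and an
    ordinary least-squares line for the ECL decay slope.
    """
    # Cmaxp: maximum value of the current peak
    c_maxp = max(current)

    # Cmind: minimum of the numerical derivative dI/dt
    deriv = [(current[k + 1] - current[k]) / (t[k + 1] - t[k])
             for k in range(len(t) - 1)]
    c_mind = min(deriv)

    # ECLsl: slope of a straight line fitted to the ECL decay
    n = len(t_ecl)
    mean_t = sum(t_ecl) / n
    mean_e = sum(ecl) / n
    sxx = sum((x - mean_t) ** 2 for x in t_ecl)
    sxy = sum((x - mean_t) * (y - mean_e) for x, y in zip(t_ecl, ecl))
    slope = sxy / sxx
    intercept = mean_e - slope * mean_t

    # R2 of the linear fit (the text reports values above 0.85)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(t_ecl, ecl))
    ss_tot = sum((y - mean_e) ** 2 for y in ecl)
    r2 = 1.0 - ss_res / ss_tot
    return c_maxp, c_mind, slope, r2
```

The returned triple (Cmaxp, Cmind, ECLsl) corresponds to one input sample of the data-driven models; the R² value only checks the quality of the slope estimate.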
Figure 3. Schematic diagrams of (a) the procedure for experimental data generation (forward approach) using the mobile phone-based ECL sensor and (b) data-driven modeling (inverse approach) using a feedforward neural network and a random forest.
Following the procedure described above, multiple experiments were performed for different concentrations of Ru(bpy)₃²⁺ distributed in a range of 0.02 to 2.5 µM. This range was established based on prior knowledge of the ECL emission for the Ru(bpy)₃²⁺/TPrA system [1]. Experimental profiles for Cmaxp, Cmind, and ECLsl were thereby obtained as a function of the concentration of Ru(bpy)₃²⁺. The goal was to include the data containing the most relevant information about the system in the training data. Because these measurements showed consistent trends, a routine implemented in the R programming environment was used to interpolate them and thereby enlarge the dataset. The resulting training dataset provided 105 interpolated data points for each input variable and the same number of data points for the corresponding output variable.
The modeling supported by AI algorithms used an inverse approach, unlike the forward approach used for data generation (and also used in mechanistic modeling), as shown in Figure 3b. In the inverse approach, the data-driven model is considered a black-box model that learns to relate the inputs, Cmaxp, Cmind, and ECLsl, to the output, i.e., the concentration of Ru(bpy)₃²⁺, from a large number of sample points. Because AI-based models have very limited extrapolation properties, their predictions are only valid for input values within the range covered by the training data.
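The two ideas in this passage, enlarging the dataset by interpolation and restricting predictions to the calibrated input range, can be sketched as follows. The authors used an R routine whose interpolation method is not described; piecewise-linear interpolation on an evenly spaced concentration grid is an assumption made here, and the function names are hypothetical.

```python
def interpolate_profile(conc, feature, n_points):
    """Piecewise-linear interpolation of one key feature on an evenly
    spaced concentration grid (conc must be sorted in ascending order)."""
    lo, hi = conc[0], conc[-1]
    grid = [lo + (hi - lo) * k / (n_points - 1) for k in range(n_points)]
    values = []
    j = 0  # index of the left end of the current measurement interval
    for c in grid:
        while j < len(conc) - 2 and c > conc[j + 1]:
            j += 1
        w = (c - conc[j]) / (conc[j + 1] - conc[j])
        values.append(feature[j] + w * (feature[j + 1] - feature[j]))
    return grid, values


def within_training_range(x, lower, upper):
    """True if every input lies inside the limits seen during training,
    i.e. the model is asked to interpolate rather than extrapolate."""
    return all(lo <= v <= hi for v, lo, hi in zip(x, lower, upper))
```

Calling `interpolate_profile(conc, cmaxp, 105)` for each feature would give a dataset of the size reported in the text, and `within_training_range` can guard a prediction call against out-of-range inputs.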

Figure 4. [...] Figure 4b, (e) zoom-in view of the shaded red box in Figure 4c. Figure 4e also shows the current derivative signal (brown line) corresponding to the current response; the green box magnifies these responses. Cmaxp: maximum value of the current peak, Cmind: minimum derivative value of the current, ECLsl: decay slope of the ECL intensity.

Random Forest (RF)
A random forest algorithm is a widely used nonparametric technique for data classification and regression analysis. A detailed description of the fundamentals of RF is given by Breiman [21]. In this study, the focus is on the application of RF to obtain a regression between the input variables (Cmaxp, Cmind, and ECLsl) and an output variable (concentration of Ru(bpy)₃²⁺). The idea of RF is to construct a set of trees from samples randomly selected from the training set by a bootstrapping technique and to generate an average prediction of the individual trees. Overfitting is mitigated because, when splitting each node of a decision tree, the RF algorithm randomly selects a subset of variables. The average of the values in the terminal nodes of the decision trees was used to estimate the concentration of Ru(bpy)₃²⁺ (Figure 3b). Therefore, the value predicted by the entire random forest, $h_j$, is given by Equation (1).

$$h_j = \frac{1}{T}\sum_{t=1}^{T} h_{jt}, \qquad t = 1, \ldots, T, \quad j = 1, \ldots, n_{\mathrm{sample}} \tag{1}$$

where $h_{jt}$ represents the concentration of Ru(bpy)₃²⁺ predicted by tree $t$ for sample $j$, $T$ represents the total number of trees, and $n_{\mathrm{sample}}$ represents the total number of samples from the training set.
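The averaging over trees in Equation (1) and the bootstrap resampling that precedes it can be illustrated with a minimal sketch. This is not the randomForest implementation used in the study: the "tree" below is a hypothetical stand-in that simply predicts the mean target of its bootstrap sample, chosen only to keep the example short.

```python
import random


def bootstrap_sample(data, rng):
    """Draw len(data) samples from data with replacement."""
    return [rng.choice(data) for _ in data]


def fit_mean_tree(sample):
    """Hypothetical stand-in for a regression tree: it predicts the mean
    target of its bootstrap sample regardless of the input."""
    mean_y = sum(y for _, y in sample) / len(sample)
    return lambda x: mean_y


def forest_predict(trees, x):
    """Equation (1): average the per-tree predictions h_jt over t = 1..T."""
    preds = [tree(x) for tree in trees]
    return sum(preds) / len(preds)


def fit_forest(data, n_tree, seed=0):
    """Grow n_tree stand-in trees, each on its own bootstrap sample."""
    rng = random.Random(seed)
    return [fit_mean_tree(bootstrap_sample(data, rng)) for _ in range(n_tree)]
```

A real regression tree would additionally split the input space, choosing among a random subset of the input variables at each node; only the resampling and averaging steps are shown here.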
The leave-one-out cross-validation (LOOCV) technique was employed to train the RF algorithm. In LOOCV, n − 1 samples from the training set are used to train the RF, and the remaining sample is used to evaluate the accuracy; this was repeated 90 times. The RF tuning parameters for the LOOCV were the number of trees to be grown (ntree), the number of predictor variables used to split the nodes at each partitioning (mtry), and the minimum size of the terminal node or leaf (node size). RF accuracy was assessed on the validation and testing sets using performance measures such as the mean square error (MSE) and the coefficient of determination (R²). The RF was implemented in the R programming environment using the randomForest package Version 4.6-14 [22], based on Breiman and Cutler's Fortran code [21].
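The LOOCV loop and the two performance measures can be sketched as follows. The study applied LOOCV to the random forest; a one-variable least-squares line is used here as a hypothetical stand-in model so that the example stays self-contained. The text does not give the formula used for R², so the usual definition 1 − SS_res/SS_tot is assumed.

```python
def mse(actual, predicted):
    """Mean square error between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res/SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Stand-in model (not the RF): least-squares line through (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return lambda x: my + slope * (x - mx)


def loocv_predictions(xs, ys, fit):
    """Train on n - 1 samples, predict the held-out one, repeat n times."""
    preds = []
    for i in range(len(xs)):
        model = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(model(xs[i]))
    return preds
```

Passing the held-out predictions to `mse` and `r_squared` gives one accuracy estimate per candidate setting of the tuning parameters.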

Feedforward Neural Network (FNN)
This work uses an FNN-type artificial neural network (ANN) [23] due to its simple mathematical form and logical architecture for data-driven modeling. These characteristics make it suitable for implementation in a prediction framework, where reduced mathematical complexity is an important factor for real-time prediction. An FNN with an input layer, one hidden layer of sigmoidal neurons, and a layer of linear output neurons was used in this study, where the numbers of neurons were I, J, and M, respectively. The neurons are highly interconnected by weight and bias parameters. Mathematically, the FNN can be represented as Equation (2).

$$g_m = F\left(\sum_{j=1}^{J} W_{mj}\, f\left(\sum_{i=1}^{I} w_{ji}\, x_i + \theta_j\right) + b_m\right), \qquad m = 1, \ldots, M \tag{2}$$

where $x_i$ and $g_m$ represent the input and output variables, respectively, $f(\cdot)$ and $F(\cdot)$ represent the activation functions of the j-th neuron in the hidden layer and of the m-th neuron in the output layer, respectively, $w_{ji}$ denotes the weight connecting the i-th neuron in the input layer and the j-th neuron in the hidden layer, $\theta_j$ denotes the bias of the j-th neuron in the hidden layer, $W_{mj}$ denotes the weight connecting the j-th neuron in the hidden layer and the m-th neuron in the output layer, and $b_m$ denotes the bias of the m-th neuron in the output layer. Figure 3b details the input variables (Cmaxp, Cmind, and ECLsl) and the output variable (concentration of Ru(bpy)₃²⁺) used to perform the FNN training. A representative dataset comprising 105 input/output samples was presented to the FNN for estimating the weights and biases (FNN parameters). The data were randomly divided into a training set and a validation set. The predictive performance of the FNN was assessed using different measurements (testing set) performed with the mobile phone-based ECL sensor. The appropriate number of neurons in the hidden layer, which prevents overfitting of the model and achieves good generalization, was determined by cross-validation (CV). In CV, FNNs with different numbers of hidden neurons, that is, different architectures, are trained with the training set, and their performance is assessed by their ability to make accurate predictions of the validation set in terms of R² and MSE. The FNN was implemented in the R programming environment using the neuralnet package Version 1.44.2 [24].
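The forward pass of the network described by Equation (2), with sigmoidal hidden neurons and linear output neurons, can be written out as a short sketch. The argument names follow the notation of the text (w, θ, W, b); the function names and the nested-list layout of the parameters are assumptions made for illustration, not the neuralnet package interface.

```python
import math


def sigmoid(z):
    """Logistic activation f(.) of the hidden neurons."""
    return 1.0 / (1.0 + math.exp(-z))


def fnn_forward(x, w, theta, W, b):
    """Forward pass of a one-hidden-layer FNN with a linear output F(.).

    x     : list of I inputs (here Cmaxp, Cmind, ECLsl)
    w     : J rows of I input-to-hidden weights, w[j][i]
    theta : J hidden biases
    W     : M rows of J hidden-to-output weights, W[m][j]
    b     : M output biases
    Returns the list of M outputs g_m.
    """
    hidden = [
        sigmoid(sum(w_ji * x_i for w_ji, x_i in zip(w_j, x)) + th_j)
        for w_j, th_j in zip(w, theta)
    ]
    return [
        sum(W_mj * h_j for W_mj, h_j in zip(W_m, hidden)) + b_m
        for W_m, b_m in zip(W, b)
    ]
```

For the architecture studied here, `x` has three entries and the function returns a single value, the predicted concentration of Ru(bpy)₃²⁺.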

Chronoamperometric Data for Data-Driven Modeling
A series of chronoamperometric measurements were performed using the mobile phone-based ECL sensor. The ECL and electrochemical key features were measured at different concentrations of Ru(bpy)₃²⁺ (from 0.02 to 2.5 µM) following the approach proposed in Section 2.4. The key features identified were the maximum value of the current peak, Cmaxp, the minimum derivative value of the current, Cmind, and the decay slope of the ECL intensity, ECLsl. The concentrations of Ru(bpy)₃²⁺ were consistent with the practical use of this luminophore as a label. Figure 5 shows the behavior of each key feature considered in this study as a function of the concentration of Ru(bpy)₃²⁺. These data clearly demonstrate the influence of the concentration of the luminophore on Cmaxp, Cmind, and ECLsl. As the concentration of Ru(bpy)₃²⁺ increased from 0.02 to 2.5 µM, the key electrochemical features, Cmaxp and Cmind, decreased as shown in Figure 5a,b, respectively. Meanwhile, ECLsl exhibited lower values at higher concentrations of Ru(bpy)₃²⁺ (Figure 5c). Previous studies [7,25] discussed the importance of having systems capable of performing ECL and electrochemical measurements in sync to develop models that investigate the mechanism of the Ru(bpy)₃²⁺/TPrA system. The consistent downward trend of experimental measurements of Cmaxp, Cmind, and ECLsl with the concentration of the luminophore made it possible for these measurements to be interpolated to generate a large dataset. This strategy allowed for well-distributed data of the key features for the calibration of the AI algorithms. This is a critical issue, as AI algorithms have very limited extrapolation properties [26].
For example, Figure 5a-c show the measurements (solid symbols) and the interpolated data (continuous lines) used to calibrate the random forest (RF) algorithm. These data and those for calibration of the feedforward neural network (FNN) were randomly divided into a training set (85%) and a validation set (15%). Prior to interpolation, three experimental measurements (i.e., three amperograms and three sets of ECL images) were randomly extracted from the original set of experimental measurements to form the testing set.
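The random division into training and validation sets can be sketched as below. The text gives only the percentages (85% and 15%), not the exact set sizes or the random seed, so the rounding rule, the seed, and the function name are assumptions.

```python
import random


def split_train_validation(samples, train_fraction=0.85, seed=0):
    """Shuffle the samples and split them into training and validation sets.

    The rounding of the training-set size and the fixed seed are
    assumptions; the text reports only the 85%/15% proportions.
    """
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]
```

The testing set is not produced by this split: according to the text, it consists of three experimental measurements set aside before interpolation, so it never contributes to the interpolated data.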

Random Forest (RF) Prediction Results
Several structures of the random forest (RF) with different ntree (number of trees to be grown) were compared to build the RF-based model. The model estimates the concentration of Ru(bpy)₃²⁺ using the maximum value of the current peak, Cmaxp, the minimum derivative value of the current, Cmind, and the decay slope of the ECL intensity, ECLsl, as input variables. Figure 6a shows that, for ntree values greater than 500, the MSE and R² did not improve significantly. Therefore, the RF tuning parameter, ntree, for the leave-one-out cross-validation (LOOCV) technique was set to 500. The remaining tuning parameters were fixed as follows [22]: number of predictor variables used to split the nodes at each partitioning (mtry) = 1.732 (square root of the number of inputs), and minimum size of the terminal node or leaf (node size) = 5. The accuracy of the model generated by the LOOCV technique was assessed by predicting the concentration of Ru(bpy)₃²⁺ for the validation set. Figure 7a shows the actual versus predicted values for this set. The corresponding assessment using the performance measures, R² and MSE, demonstrated that the model predictions were particularly accurate. As for the testing set, the RF prediction results were similar to those observed for the validation set. The actual versus predicted values and the performance measures are presented in Table 1. The results showed that the RF-based model can directly infer the concentration of Ru(bpy)₃²⁺ from certain key features of the multimodal data of the mobile phone-based ECL sensor. To the best of the authors' knowledge, the RF was not previously used for the regression analysis of data from electrochemical/ECL sensors because it is relatively easier to understand the mathematical form of parametric models such as the FNN. RF can achieve high precision when a large number of input variables with a large amount of data are used [27]. Nevertheless, this study shows that a reduced number of significant input variables (called key features) is sufficient to achieve accurate prediction results. The RF accuracy was slightly higher than that obtained using the FNN, as shown in the next section.
Figure 5. [...] value of the current peak, Cmaxp, (b) minimum derivative value of the current, Cmind, and (c) decay slope of the ECL intensity, ECLsl.

Feedforward Neural Network (FNN) Prediction Results
Different network architectures with a single hidden layer were compared to build the data-driven FNN model that predicts the concentration of Ru(bpy)₃²⁺. The optimal architecture was determined by varying the number of neurons in the hidden layer; in total, 16 architectures were assessed, as shown in Figure 6b. The number of hidden neurons was chosen by cross-validation, with the number of training epochs fixed at 1.0 × 10⁵ for all architectures studied. The FNN with 16 hidden neurons gave the lowest MSE and the most favorable R² for the validation set (Figure 6b). Thus, the optimized model used a 3-16-1 (input-hidden neurons-output) architecture containing 81 parameters (weights and biases). Table 2 lists the optimized FNN parameters in the notation of Equation (2). Figure 7b compares the actual concentrations of Ru(bpy)₃²⁺ with the values predicted by the optimized model for the validation set. The model accurately predicted the concentration of Ru(bpy)₃²⁺, as assessed by R² and MSE. For the testing set, Table 1 shows that the FNN-based model also described the experimental measurements accurately (R² = 0.961, MSE = 0.0356). Nevertheless, the accuracy of this prediction was slightly lower than that obtained with random forest (R² = 0.996, MSE = 0.0012). Previous studies [28,29] showed that FNNs, used as a data regression method in the development of sensors based on electrochemical measurements, provided high-precision predictions. However, to the best of the authors' knowledge, this is the first study to predict the concentration of a compound by combining key features from multimodal data (ECL imaging and amperograms) in a single FNN.
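The parameter count quoted for the 3-16-1 architecture can be checked by hand: a fully connected network with one hidden layer has (inputs × hidden) weights plus one bias per hidden neuron, and (hidden × outputs) weights plus one bias per output. The minimal Python sketch below reproduces this arithmetic. The function name `count_parameters` is illustrative and does not come from the original work, and the sketch assumes every neuron carries exactly one bias term.

```python
def count_parameters(n_inputs: int, n_hidden: int, n_outputs: int = 1) -> int:
    """Count weights and biases of a fully connected network with one hidden layer.

    Assumption: each hidden and output neuron has exactly one bias term.
    """
    # Weights from every input to every hidden neuron, plus one bias per hidden neuron.
    hidden_layer = n_inputs * n_hidden + n_hidden
    # Weights from every hidden neuron to every output, plus one bias per output.
    output_layer = n_hidden * n_outputs + n_outputs
    return hidden_layer + output_layer


if __name__ == "__main__":
    # 3-16-1 architecture reported in the text: 3*16 + 16 + 16*1 + 1
    print(count_parameters(3, 16, 1))
```

For three inputs and one output, the count grows by five parameters for each additional hidden neuron, which is one reason to keep the hidden layer small when the training set is limited.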
While the FNN achieved acceptable prediction accuracy for the testing set in this study, further investigations could use deep learning to improve the prediction accuracy of the neural networks. Recent advances in training techniques and increased computational resources have made it possible to construct deep neural networks such as the convolutional neural network [30] and the recurrent neural network [31]. These architectures could be applied to the development of ECL sensors, as they are particularly useful for image processing and time series data.

Contour plots were generated from the validated models (Figure 8a,b for RF and FNN, respectively) to visualize the relationships between the input variables (C_maxp and ECL_sl) and the concentration of Ru(bpy)₃²⁺ (response variable). The contours for both the FNN and the RF were nonlinear and revealed that the concentration of Ru(bpy)₃²⁺ decreased as the values of C_maxp and ECL_sl decreased. The magnitude of the effects of the input variables on the response variable can also be inferred from these plots: the concentration of Ru(bpy)₃²⁺ was more sensitive to variation in ECL_sl than in C_maxp. Contour plots were especially useful for displaying the system behavior, given the complexity of the developed models, which are either nonparametric, such as the RF, or lack simple prediction equations, such as the FNN. As in previous works [26,32], Figure 8a,b show the typical behavior of contour plots generated from a nonparametric model and a parametric model, respectively. In this study, the use of a reduced number of key features allowed fast calibration and operation of the AI algorithms to predict the concentration of Ru(bpy)₃²⁺. A greater number of key features could be considered in the construction of the data-driven models; however, some features could have little or no effect on the response. Therefore, before incorporating more key features into the models, a sensitivity analysis should be performed to determine their potential contribution.
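One simple way to carry out such a sensitivity analysis is a one-at-a-time perturbation: each input is changed by a small relative amount while the others are held at a baseline, and the relative change in the predicted output is recorded. The Python sketch below illustrates the idea under stated assumptions. The function `oat_sensitivity`, the 5% step, and the linear `toy_model` are hypothetical stand-ins chosen for illustration; they are not the trained RF or FNN from this study, and the original work does not specify which sensitivity method would be used.

```python
from typing import Callable, List, Sequence


def oat_sensitivity(
    model: Callable[[Sequence[float]], float],
    baseline: Sequence[float],
    rel_step: float = 0.05,
) -> List[float]:
    """One-at-a-time sensitivity of a scalar model output to each input.

    Returns, for each input, the relative change in output divided by the
    relative change in that input. Assumes the baseline output and every
    baseline input are nonzero.
    """
    y0 = model(baseline)
    sensitivities = []
    for i, x in enumerate(baseline):
        perturbed = list(baseline)
        perturbed[i] = x * (1.0 + rel_step)  # perturb only input i
        y1 = model(perturbed)
        sensitivities.append(((y1 - y0) / y0) / rel_step)
    return sensitivities


def toy_model(features: Sequence[float]) -> float:
    """Hypothetical stand-in for a trained predictor; the third input has no effect."""
    return 0.5 * features[0] + 2.0 * features[1] + 0.0 * features[2]


if __name__ == "__main__":
    for name, s in zip(["feature_1", "feature_2", "feature_3"],
                       oat_sensitivity(toy_model, [1.0, 1.0, 1.0])):
        print(f"{name}: {s:.3f}")
```

An input whose sensitivity stays near zero across the range of interest, like the third input of the toy model, would be a candidate for exclusion. For strongly nonlinear models such as the RF and FNN, the result depends on the chosen baseline, so the perturbation should be repeated at several operating points.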
Extending the approach presented in this study to other applications, such as the detection of analytes of interest through the enhancement or quenching of their luminescent intensities, is straightforward. In this case, the concentration of Ru(bpy)₃²⁺ [...]. For instance, phenolic compounds demonstrated a highly efficient quenching effect in the Ru(bpy)₃²⁺/TPrA system [33]. Future work will therefore build on the results obtained in this study to develop an AI-driven smartphone-supported ECL sensor to monitor phenolic compounds in wastewater from biofuel plants. The present study is important in this respect because it provides a proof of concept demonstrating the feasibility of developing a sensor for intelligent detection of analytes.

Conclusions
The quantitative investigation of the relationships between the concentration of Ru(bpy)₃²⁺ and its experimentally measured electrochemical and ECL features naturally leads to complex models that are very difficult to calibrate. Key features of the system must be examined for the model to generalize effectively. This study proposes a novel AI-based modeling approach, using random forest (RF) and feedforward neural network (FNN) algorithms, to correlate the concentration of Ru(bpy)₃²⁺ with key features obtained from sequences of ECL images and amperograms. All multimodal measurements were obtained from a low-cost smartphone-based ECL sensor. The input (key features) and output (concentration of Ru(bpy)₃²⁺) variables were used to generate sample points, from which data-driven models were built using RFs and FNNs. The predictions of the data-driven models agreed with the measurements (validation and testing sets) performed with the smartphone-based ECL sensor. Contour plots allowed quantitative determination of the relevance of the key features to the output and of the relation between them. The AI approaches were capable of directly inferring the concentration of Ru(bpy)₃²⁺ using easily observable key features, whereas traditional mechanistic modeling requires a complex calibration procedure. Future work will extend the proposed approach to develop a robust, practical, and affordable sensor for intelligent detection of analytes of economic relevance, such as phenolic compounds.