Using a physics-informed neural network and fault zone acoustic monitoring to predict lab earthquakes

Predicting failure in solids has broad applications, including earthquake prediction, which remains an unattainable goal. However, recent machine learning work shows that laboratory earthquakes can be predicted from micro-failure events and the temporal evolution of fault zone elastic properties. Remarkably, these results come from purely data-driven models trained with large datasets. Such data are equivalent to centuries of fault motion, rendering application to tectonic faulting unclear. In addition, the underlying physics of such predictions is poorly understood. Here, we address scalability using a novel Physics-Informed Neural Network (PINN). Our model encodes fault physics in the deep learning loss function using time-lapse ultrasonic data. PINN models outperform data-driven models and significantly improve transfer learning for small training datasets and conditions outside those used in training. Our work suggests that PINN offers a promising path for machine learning-based failure prediction and, ultimately, for improving our understanding of earthquake physics and prediction.

Response to reviewers
The authors thank the reviewers for their insightful comments, which have improved and clarified the manuscript. The revisions made to the manuscript in response to the reviewers' comments are all noted in red for your reference. Below are our answers to all the comments. The reviewers' comments are written in black, and our responses are in blue.

Reviewer #1 (Remarks to the Author):
Summary: The authors have developed a workflow to design and train a neural network to predict shear stress and slip speed on a laboratory fault using physics-based regularization terms in the loss function. They found that the addition of physics-based regularization helped find more generalizable models (perform better on test data set) that still perform well when few training data are available. The manuscript is well written, the results are clearly presented and the benefits of using physics-based regularization are clearly demonstrated. The data and codes used in the study are publicly available online. My comments are only about the message delivered by the paper, I don't think that any additional technical work is necessary.
Major comments:

1. The authors introduce their work in the context of the great challenge of predicting failure in the field (either for natural, tectonic earthquakes or induced earthquakes) and claim that this study will help design future models for field applications. While I'm fully convinced that such physics-based regularizers should be included as much as possible in geophysical studies, I think that the authors should elaborate more on how their study relates to a possible application in the field. The model uses two variables, the shear stress and slip speed on the fault, that are completely unavailable in the field with today's technology. Getting the shear stress and slip speed in space and time on natural faults is certainly a much harder problem than using these variables to predict the next failure, and a neural network might then not even be necessary to anticipate failure. In its current shape, I find the study to be more a proof that failure is predictable in a relatively simple rock sample than an actual step towards predicting failure on natural faults.
We completely agree with the reviewer that such an approach is not readily applicable at this time, but we would like to promote a broader view of physics-informed approaches.
We have added a paragraph in the discussion section (lines 230-240) to clarify this aspect and provide arguments as to why this study nonetheless represents a step toward prediction in the future.
"This study demonstrates that adding physics-based constraints to ML models is greatly beneficial for failure prediction, especially when datasets are scarce. On the other hand, we recognize that the model developed here cannot be directly applied to field data, because shear stress and slip rate data at depth are not accessible in the field. Moreover, very few active seismic surveys performed continuously over extended periods of time are available (Niu et al., 2008). Nonetheless, we believe this work represents a step toward failure prediction in the field for the following reasons. First, an approach similar to the one presented here still using lab data might be followed to better constrain the rate and state frictional models and associated parameters that are used in geodetic studies to infer fault slip distribution at depth (Avouac, 2015;Barbot et al., 2012;Bürgmann et al., 2002;Fielding et al., 2013;Gualandi et al., 2017;Hearn & Bürgmann, 2005;Kano et al., 2018;Michel et al., 2019;Wallace et al., 2016). Second, stress and slip rate data might be inferred in the field with ML models using earthquake recurrence as input data, and possibly pre-training on lab data.'' 2. The answer to this question might be in the "Data-Driven Models" section but I wasn't sure: For a given size of the training set, do you use the same MLP architecture for all three models? It seems to be an important point because if you repeat the hyperparameter tuning and find different optimal architectures for the data-driven vs PINN 1 vs PINN 2 model, then you are not only comparing different loss functions but also different architectures, which complicate the interpretation of the comparison.
Yes, for a given size of the training data split, all the models (data-driven, PINN 1, and PINN 2) use the same MLP architecture. The PINN 1 and PINN 2 frameworks are built on top of the data-driven models, so all three models share the same architecture, including the number of hidden layers and units, batch size, optimizer, and learning rate; only the loss functions differ. To clarify these points, the following sentence is added to the revised manuscript in the Results and Discussion section (lines 141-143).

"The data-driven, PINN 1, and PINN 2 models share the same MLP framework (hidden layers, units, batch size, optimizer, and learning rate) across different data splits to allow a one-to-one comparison."
Minor comments:

1. Lines 32-35: The authors use the term "acoustic emissions" to talk equally about discrete events and the continuous recordings, which I find confusing. It gives the impression that 100% of the signal in the continuous recordings is due to failure events whereas part of it is "ambient noise". Although "ambient noise" in the lab might very well mostly be caused by the reverberations of the waves emitted by the AE events, I think it's important to note that waves that are not radiated from the monitored fault can carry relevant information about the elastic properties of the bulk around the fault. In that sense, I think the two sentences lines 32-35 are misleading because they say that hidden, active sources carry information whereas "ambient noise" also carries relevant information. In other terms, the relevant information might not be that much in the source signature than in the path signature. It's probably me not interpreting well the sentences but I suggest making the phrasing clearer as I might not be the only one reading it this way.

We agree with the reviewer that this paragraph needs to be corrected and clarified. The passive data collected during such experiments are recorded continuously and contain impulsive events, as well as a featureless signal that looks like noise. Our past work suggests that this featureless signal can be due either to a lack of events (locked fault), or to many small AE events that overlap and cannot be isolated. The latter case was encountered when attempting to catalog events (Bolton et al., 2020), where we find that there is a deficit of small events at such low amplitudes (and this magnitude of completeness is found to increase with the imposed shearing velocity). From our current work and work in progress (Marty et al., in preparation), we know that the vast majority of events occur on the fault plane, but we agree that the energy captured by the sensors also comes from subsequent scattering, carrying information about the stress state within the host rock. The paragraph now reads as follows (lines 28-40 of the revised manuscript):

"Numerous laboratory studies have shown that the onset of failure is associated with bursts of acoustic emission (AE) events taking place during crack initiation and growth, and the number and amplitude of AE events generally increase as the sample approaches failure (Bu et al., 2022; Chow et al., 1995; Dunegan & Harris, 1969; Jansen et al., 1993; Lockner, 1993; Rivière et al., 2018; Rouet-Leduc et al., 2017; Savage & Hasegawa, 1964; Scholz, 1968). Recent friction studies on laboratory faults have shown that machine learning (ML) algorithms can actually predict the timing and magnitude of lab quakes using AE data (Hulbert et al., 2019; Jasperson et al., 2021; Laurenti et al., 2022; Lubbers et al., 2018; Pu et al., 2021; Rouet-Leduc et al., 2017; Wang et al., 2021, 2022). It is remarkable that solely using acoustic emission data radiating from the faults as an input, the fault strength can be accurately predicted throughout the laboratory seismic cycle (Rouet-Leduc et al., 2018). Past work (Blanke et al., 2021; Goebel et al., 2012, 2013) has shown that the vast majority of events radiate from the fault plane, therefore carrying information about the fault state. And as the elastic waves radiate/scatter through the host granite blocks, they also provide information about the stress state of the host rock. It is also remarkable that predictions work in the early stage of the seismic cycle when the acoustic signal often looks like noise, either because it lacks a clear P-wave, such as expected for friction/fracture events, or because it represents something like tectonic tremor involving the sum of many small or low frequency events that overlap in time and cannot be distinguished as separate events (Bolton et al., 2020)."

Reviewer #2 (Remarks to the Author):
Dear Authors and Editor, I am writing to provide my reviews for the paper NCOMMS-22-50565-T titled "Using a physics-informed neural network and fault zone acoustic monitoring to predict lab earthquakes" by Prabhav Borate et al.

The paper describes how physics-informed neural networks can be used to improve the regression between experimental observations (ultrasonic wavespeed and transmission amplitude) and the shear stress and slip rate of a loaded block analog to a natural fault producing lab quakes. I found the experimental setup and data processing stages to be clearly described and of high quality. Importantly, the authors address the question of how physics-based constraints can be used to improve machine-learning-based regression. The authors show that integrating physics-based knowledge into the neural network can enhance its performance in comparison to classical multilayer perceptron models, particularly when dealing with limited training data. This finding is particularly relevant for potential applications to field data, where multiple seismic cycles are not always observed. Furthermore, they highlight the tradeoff between imposing too much a priori knowledge versus providing enough flexibility to generalize the results, with a transfer learning experiment.

Overall, I believe the paper is of high quality and has the potential to make a significant contribution to the field. With some minor corrections, I recommend that it should be considered for publication. Please find below my general remark, which I hope will be useful to the authors in revising their work.
1. Regarding my general remark, I would like to draw your attention to the observation made in the transfer learning experiment, where the less constrained PINN outperforms the most constrained one. While the authors provide a sentence to discuss this result, I think it may deserve more insights. As stated by the authors, the likely explanation for this is the too much constrained loss function in PINN#2, which incorporates ultrasonic attributes and may not be transferable straightforwardly between experiments. However, I do not fully understand why this point is not tackled by the fine-tuning performed in the transfer learning approach. In other words, would this point be better addressed by retraining PINN#2 sufficiently or performing a larger fine-tuning, or is the network stuck in a local minimum?
Thank you for this comment. Besides the hyperparameter tuning carried out in our study, we present results for cosine decay and layer freezing in our transfer learning (TL) study to confirm whether such detailed fine-tuning can address this point. Selecting an appropriate learning rate during model training is crucial for optimizing the weights: optimization diverges if the learning rate is set too high, and convergence is slow if it is set too low. For all the transfer-learned (TL) models, hyperparameter tuning using grid search resulted in an optimal learning rate of 1e-3 and a batch size of 32, as reported in lines 317 to 319 of the originally submitted manuscript.
Cosine decay is one of the most widely used approaches for learning rate decay; it helps the model move away from saddle points and converge to better local minima. Cosine decay, along with variants such as polynomial decay, is commonly used to stably train transformers and other state-of-the-art models. As shown in Figure 1 below, we started model training with a learning rate of 10^-2 and gradually decreased it to 10^-4 as training progressed. Table 1 below reports the performance of the PINN #1 and PINN #2 models using this decay for varying training data sizes. Consistent with the results presented in the paper, the PINN #1 models outperform the PINN #2 models, with consistently higher R² scores.

Figure 1: Model training with a learning rate varying from 10^-2 to 10^-4 following the cosine decay schedule. The PINN models developed have 5 hidden layers.

For the fine-tuning study, the models trained on the p5270 dataset have K frozen (not trainable) layers and (N-K) trainable layers, where N is the total number of layers and K is the number of frozen layers. We use the (N-K) trainable layers to optimize our model on the p5271 dataset, to better understand the feature importance learned on the prior dataset. Tables 2-5 compare the performance of the PINN #1 and PINN #2 models on the testing dataset for training data sizes from 70% down to 10%, respectively. Again, the PINN #1 models consistently outperform the PINN #2 models in predicting shear stress and slip rate in all fine-tuning scenarios. The models perform best when all of the layers are unfrozen (allowed to train) and all model weights are initialized from the pre-trained network, such that fine-tuning them with a low learning rate leads to better performance; on the other hand, performance starts to degrade once the number of trainable (unfrozen) layers (N-K) becomes smaller than the number of frozen layers K.
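For concreteness, the two strategies amount to the following minimal sketch (PyTorch for illustration; the epoch count T_max and the value of K are placeholders, and SharedMLP refers to the sketch given earlier in this response letter):

```python
import torch

# Cosine decay of the learning rate from 1e-2 down to 1e-4 during fine-tuning
# (the values quoted above); T_max, the number of epochs, is illustrative.
model = SharedMLP()  # pre-trained on p5270 (see the earlier sketch)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=200, eta_min=1e-4)

# Freeze the first K hidden layers and fine-tune the remaining (N - K) on p5271.
K = 2  # number of frozen layers, varied across the scenarios in Tables 2-5
linear_layers = [m for m in model.net if isinstance(m, torch.nn.Linear)]
for layer in linear_layers[:K]:
    for p in layer.parameters():
        p.requires_grad = False

# Typical loop: optimizer.step() per batch, then scheduler.step() once per epoch.
```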
In addition to the fine-tuning described above, we carried out a study in which we built standalone PINN #1 and PINN #2 models on the p5271 dataset (TL dataset). The corresponding performances are shown in Table 6 below for models using a split of 70%-10%-20% for training, validation, and testing. According to these results, the PINN #2 model performs better than the PINN #1 model, much like the standalone models for the p5270 dataset. These studies support our observation that in the TL case, the PINN #1 models consistently perform better than the PINN #2 models, possibly because they are less constrained (exclusion of ultrasonic attributes). We do not have another explanation for this observation. To address this, the following statement is added to the revised manuscript in lines 213 to 215:

"Further model tuning with a cosine decay schedule and fine-tuning (freezing one or more layers) show that the TL PINN #1 models consistently outperform the TL PINN #2 models in predicting shear stress and slip rate in all scenarios."

2. In terms of minor comments, I suggest capitalizing "physics-informed neural network" to emphasize the definition of the acronym PINN, both at line 14 and when it is redefined at line 49. Also, "SHM" is defined as an acronym but only used once; therefore, you may consider removing it at line 17.
Thank you. We modified lines 14 and 49 from "physics-informed neural network" to "Physics-Informed Neural Network" to emphasize the definition of the acronym PINN, and we removed the acronym "SHM".
3. Regarding Figure 1a, the shear stress and slip rate are encoded with the same color, and the line styles do not allow distinguishing them. Would it be possible to differentiate them with two colors? Additionally, in Figure 1b, you may consider using the label "Normalized shear stress and slip rate" instead of "Normalized", and the "Time" label may also refer to "Ultrasonic signal local time".
We revised Figure 1a according to your suggestion. The slip rate is now plotted in black and the shear stress in purple. The line widths are also increased to make the figure clearer and more readable. In Figure 1b, the left y-axis label is changed from "Normalized" to "Normalized Shear Stress & Slip Rate" and the right y-axis label is changed from "Time" to "Ultrasonic Signal Local Time".

4. In the Ultrasonic Feature Extraction section, could you please clarify how you select the reference waveform, and whether your results are reference-dependent?
The reference waveform $S_0$ is chosen past the peak friction, right before the fault starts its transition from stable sliding to unstable seismic cycles, as marked in Figure 1a of the revised manuscript.

Two features, the wave speed $V_i$ and the spectral amplitude $A_i$, are extracted from the recorded waveforms. The amplitude $A_i$ is calculated from the Fourier transform of the recorded signals, so it is independent of the reference waveform. The extraction of $V_i$ involves time-of-flight $t_i$ calculations that use the arrival time $t_0$ of the reference waveform and the estimated time delay $\Delta t_i$, computed through cross-correlation. Since the waveform shapes change only slightly and the cross-correlation coefficients for the recorded waveforms always remain above 0.97, the choice of reference waveform does not significantly affect the extracted feature $V_i$. While we have not conducted a systematic study of this point (i.e., using datasets with different reference waveforms as input to the ML models), in our experience, when the correlation coefficient is very close to one, the extracted wave speeds are unaffected by the choice of reference.
To address this point, lines 115 to 119 in the original manuscript are rewritten as follows:

"To calculate the evolution of wave speed during frictional sliding, we first extract the time delay $\Delta t$ by cross-correlating each waveform $S_i$ with a reference waveform $S_0$. The reference waveform is chosen past the peak friction, just before the fault starts its transition from stable sliding to unstable seismic cycles (thin vertical dashed line at time = 2065 s in Figure 1a). The shape of the recorded waveforms $S_i$ changes little throughout the experiment, such that the cross-correlation coefficient always remains greater than 0.97."
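For concreteness, the delay extraction amounts to the following minimal sketch (NumPy/SciPy; function and variable names are illustrative, not the names used in our code):

```python
import numpy as np
from scipy.signal import correlate

def time_delay(s_i, s_0, fs):
    """Delay of waveform s_i relative to the reference s_0 (both 1-D arrays
    of equal length), estimated from the peak of the normalized
    cross-correlation; fs is the sampling frequency in Hz."""
    a, b = s_i - s_i.mean(), s_0 - s_0.mean()
    cc = correlate(a, b, mode="full") / (np.linalg.norm(a) * np.linalg.norm(b))
    lag = int(cc.argmax()) - (len(s_0) - 1)  # lag in samples
    return lag / fs, float(cc.max())          # (delay dt_i, correlation coeff.)

# The wave speed then follows from the reference arrival time t_0 and the
# propagation path length L across the sample: v_i = L / (t_0 + dt_i).
```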
5. Finally, PZT is defined but used only once at line 102.

Thank you for pointing this out. The acronym PZT was removed from line 102.

Reviewer #3 (Remarks to the Author):
Comments for Borate et al. "Using a physics-informed neural network and fault zone acoustic monitoring to predict lab earthquakes"

This study shows the superiority of physics-informed neural networks (PINN) compared to data-driven neural networks for predicting shear stress history and slip rates in laboratory shear-slip experiments. The structure of the paper is straightforward and clearly demonstrates the superiority of PINN, and I did not find any particular problem. The manuscript is clearly written, and I personally think it is almost ready for publication. One concern is that the comparison target is a simple multilayer perceptron neural network that is not expected to perform well, so the paper is not very interesting in terms of performance. However, this study shows that PINN, which includes simple physical rules, can be effective to a certain extent, and I consider this study worth publishing.
1. P.3 L 104: Are the piezoelectric transducers used as a transmitter and a receiver the same type? If possible, it would be desirable to specify the manufacturer and model number.
Yes, the same type of piezoelectric transducer is used for the transmitter and the receiver. The piezoelectric disks are 12.7 mm in diameter and 4 mm thick, made of material 850 from American Piezo Ceramics (APC International). This information is now included in the revised manuscript in lines 104 to 106 as follows:

"The two identical piezoelectric disks, used as transmitter and receiver, are 12.7 mm in diameter and 4 mm thick (corresponding to a center frequency of 500 kHz), made of material 850 from American Piezo Ceramics (APC International)."

2. P. 4 L 123: Does this 10-point backward-looking average affect the results?
Yes, the feature smoothing is part of the data pre-processing, and it does improve data-driven model performance (Shreedharan et al., 2021). We note that the averaging is done carefully (backward-looking), and the same pre-processed data are used for the data-driven and PINN models.
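For clarity, a backward-looking (causal) average uses only the current and previous samples, so no future information leaks into the features, as in this minimal NumPy sketch (the 10-point window matches the manuscript; the function name is illustrative):

```python
import numpy as np

def backward_average(x, n=10):
    """Causal n-point moving average: output sample j averages x[j-n+1..j]
    only, never future samples (the first few samples average fewer than
    n points because of the edge)."""
    return np.convolve(x, np.ones(n) / n, mode="full")[: len(x)]
```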
3. P. 10 L 301: I did not understand how the parameters σ, K, ... were obtained in the PINN approach. Could you add a description?
The actual values of some experimental parameters are known (normal stress $\sigma$, shearing rate $V_l$, and density $\rho$) or measurable (system stiffness $k$). However, to avoid any errors due to unit mismatch between the features (spectral amplitude, wave speed), the outputs (shear stress, slip rate), and these constants in the physics constraints, we treat them as learnable parameters that are learned alongside the neural network parameters (weights and biases) in the proposed PINN framework. In addition, this provides an opportunity to check the model's behavior by comparing the learned vs true parameters.
These weights are updated during model training using stochastic gradient descent and its variants. Once the model is fully trained, these learned weights are extracted from the model layers, scaled back (to undo normalization), and compared with the true available experimental values as shown in Table 2 of the originally submitted manuscript. This gives us the chance to look into the models' ability to correctly learn the values of these constants.
To address this comment, the following paragraph is added in the revised manuscript in lines 325 to 334:

"Although these parameters are either known ($\sigma$, $V_l$, $\rho$, $k$) or measurable ($A_{Intact}$ could be measured by testing an intact granite block of the same thickness as the cumulative thickness of the blocks used in the friction experiment), we treat them as trainable neural network weights in the PINN framework. This approach is used to avoid errors due to unit mismatch between features, outputs, and these constants in the constraints. These weights are extracted from the layers of the fully trained models and converted back to the original scale to undo the effect of data normalization (see implementation details at https://github.com/prabhavborate92/PINN_Paper.git). A comparison of the scaled learned weights with the known parameter values gives us the opportunity to examine the PINN model by determining how well the models are able to learn the values of parameters measured experimentally."
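A minimal sketch of this idea (PyTorch for illustration; initial values and attribute names are placeholders, the actual implementation being in the repository linked above):

```python
import torch
import torch.nn as nn

class PhysicsConstants(nn.Module):
    """Known or measurable constants treated as trainable weights, so that
    unit/scale mismatches between features, outputs, and the physics
    constraints are absorbed during training."""
    def __init__(self):
        super().__init__()
        self.sigma = nn.Parameter(torch.tensor(1.0))     # normal stress
        self.v_l = nn.Parameter(torch.tensor(1.0))       # shearing (load-point) rate
        self.rho = nn.Parameter(torch.tensor(1.0))       # density
        self.k = nn.Parameter(torch.tensor(1.0))         # system stiffness
        self.a_intact = nn.Parameter(torch.tensor(1.0))  # intact-block amplitude

# After training, undo the data normalization before comparing with the
# laboratory values, e.g. k_estimate = constants.k.item() * k_scale, where
# k_scale comes from the preprocessing step.
```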
4. Fig. 1 It is difficult to distinguish shear stress and slip rate in this figure. I want to suggest using different colors.
Thank you. Reviewer 2 made the same comment. We have revised Figure 1 as shown below: the slip rate is now plotted in black and the shear stress in purple. The line widths are also increased to make the figure clearer and more readable.

5. The authors used a simple perceptron model as a data-driven model. In my opinion, for such time series data, a convolutional neural network approach could obtain better performance, possibly comparable to the PINN results. Have the authors performed such validation?
Thank you for this comment. We did produce additional results using convolutional layers to test whether the filters can stably extract local temporal information from the data. However, as discussed in this work, physics constraints reduce the overall parameter search space and help the model generalize without compromising convergence speed. Complex models such as CNNs, because of their tensor weights, often lead to over-parameterization and sub-optimal results when working with temporally dynamic signals such as the ones used in this study. Our results validate this hypothesis. In our experiments, the CNN has the same number of parameters as our MLP and PINN models; we did this to avoid memorization effects due to a higher parameter count and to ensure a fair comparison with our models.
Here, we discuss the results obtained when implementing a convolutional neural network (CNN) on the p5270 dataset used in this study. The CNN architecture consists of a standard setup comprising convolutional, activation, max-pooling, and flatten layers, as shown in Figure 1 below. As with our MLP models, the two input features (wave amplitude and speed) are provided with a 3-second history (300 samples for each feature, i.e., 600 samples in total) before the current time to predict shear stress and slip rate at the current time. The CNN layers have tensor weights that convolve across the features to generate a weighted sum, which is passed through a ReLU activation function; we then apply max-pooling over the output of the last layer. The resulting 2D pooled feature maps are flattened into a single continuous vector.
To estimate shear stress and slip rate, this vector is provided as input to the fully connected layers. The CNN model is implemented with a training-validation-testing split of 70%-10%-20% and optimized using mean squared error. Furthermore, we perform a grid search over the learning rate and batch size to find the optimal hyperparameters for our CNN model. The performance of the model (R² score) in predicting shear stress and slip rate is tabulated in Tables 1 and 2, respectively. Although CNNs are known to be effective at extracting discrete signal features, the poor model performance suggests that this architecture is not responsive to the temporal dynamics that are essential here.
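For reference, the architecture just described corresponds to something like the following minimal sketch (PyTorch; channel counts and kernel sizes are illustrative rather than the grid-searched values):

```python
import torch.nn as nn

# Two input channels (wave amplitude and wave speed) over a 300-sample (3 s)
# history window; two regression outputs (shear stress and slip rate).
cnn = nn.Sequential(
    nn.Conv1d(in_channels=2, out_channels=16, kernel_size=5),  # -> (16, 296)
    nn.ReLU(),
    nn.MaxPool1d(kernel_size=2),                               # -> (16, 148)
    nn.Flatten(),                                              # -> 16 * 148
    nn.Linear(16 * 148, 64),
    nn.ReLU(),
    nn.Linear(64, 2),
)
```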
Furthermore, we have done another study on the same p5270 dataset (manuscript in preparation), which is relevant to this discussion. In this study, we use CNN layers to (automatically) extract features from raw ultrasonic signals recorded during stick-slip experiments. The best shear stress prediction performance using these automatically extracted features for a model developed with a training-validation-test split of 70%-10%-20% is provided in Table 3. A comparison with the results presented in this paper indicates that CNN-extracted features do not lead to a better performance than hand-picked features (wave speed, wave amplitude, and center frequency).
These studies show that CNN models do not perform better than MLP models. Although using CNN layers is appropriate for extracting features from raw ultrasonic signals, due to the averaging involved, the resulting models are inferior to models that use expert-crafted features (wave speed, amplitude, and center frequency).