Analysis of the Composition of Ancient Glass and Its Identification Based on the Daen-LR, ARIMA-LSTM and MLR Combined Process

Glass relics are precious material evidence of early trade and cultural exchange between the East and the West. To explore the cultural differences and trade development between early China and foreign countries, it is extremely important to classify glass cultural relics. Despite their similar appearance, Chinese glass contains more lead, while foreign glass contains more potassium. In view of this, this paper proposes a joint Daen-LR, ARIMA-LSTM, and MLR machine learning algorithm (JMLA) for the analysis and identification of the chemical composition of ancient glass. We separate the sampling points of ancient glass into two systems: lead-barium glass and high-potassium glass. Firstly, an improved logistic regression model based on a double adaptive elastic net (Daen-LR) is used to select variables with both oracle and adaptive classification properties. Secondly, the ARIMA-LSTM model is used to establish the correlation curve of chemical composition before and after weathering and to predict the change in chemical composition with weathering. Thirdly, combining the data processed by the above two methods, a multiple linear regression model (MLR) is used to classify unknown glass products. The samples obtained by this processing method show a very good fit. In comparison with similar models such as Decision Trees (DT), Random Forests (RF), Support Vector Machines (SVM), and Random Forests based on classification and regression trees (CART-RF), the classification accuracy of JMLA is 97.9% on the training set and 97.6% on the test set. The results demonstrate that JMLA improves the accuracy of glass type classification, greatly enhances the research efficiency of archaeological staff, and yields more reliable results.


Introduction
Machine learning (ML) algorithms are a set of mathematical models and statistical methods [1] that can be used in computer systems to learn and make predictions or decisions based on patterns in data. In the field of archaeology, there are many examples of machine learning algorithms applied in the direction of conservation and restoration, provenance research, and the management of cultural heritage. In 1798, the German scientist M.H. Klaproth conducted the first quantitative chemical study of Roman-era glass [2], improving the procedure for gravimetric analysis and devising various procedures for the determination of non-metallic elements, accurately determining the composition of nearly 200 minerals and various industrial products. In 2003, Professor Fuxi Gan and his research team used the proton-induced X-ray emission (PIXE) technique to quantify the chemical composition of a batch of ancient glass excavated in Yangzhou and Hubei, with the goal of studying the origin, system, and preparation process of ancient Chinese glass [3]. As more and more ancient silicate artifacts were unearthed, some scholars began to classify them based on their chemical composition. In 1992, the Korean scientist Lee Chul applied a chemometric pattern recognition method to multivariate data to determine the classification of 94 ancient Korean glass pieces using neutron activation analysis and principal component analysis [4]. In 2010, El-Taher, an Egyptian scholar, used instrumental neutron activation analysis (INAA) and HPGe detector γ-spectroscopy to determine, qualitatively and quantitatively for the first time, a total of 16 elements in feldspar rock samples collected from Gabel El Dubb, Eastern Desert, Egypt, and to classify the rock samples [5]. In 2011, the Thai scholar Won-in K.
and his team used Raman spectrophotometry for the first time to characterize fragments of archaeological glass samples with the aim of obtaining information to identify glass samples for classification by laser scattering [6]. In 2019, Nadine Schibille and her team established a temporal model that serves as a tool for dating archaeological glass assemblages as well as a geographical model that allows for a clear classification of Levantine and Egyptian plant ash glasses [7]. However, it is worth noting that the application and extension of machine learning algorithms in the direction of cultural heritage (CH) component analysis and identification of categories are very limited [8].
In recent years, when studying the chemical composition of ancient glass objects, the classification of glass has mainly been determined by the weight ratio of oxides or by analyzing the mass fraction of compounds containing lead and potassium [9][10][11][12][13]. However, the percentage of lead and potassium compounds varies with the region where the glass was produced and the degree of weathering, which interferes with the classification of the glass. This study is therefore based on the data related to ancient glassware provided by the official website of the 2022 China Student Mathematical Modeling Competition [14]. The weathering of glass over thousands of years can cause significant changes in its internal chemical composition; as a result, determining the type of glass from the amount of a certain chemical component is neither reliable nor scientific. Therefore, based on the double adaptive elastic net improved logistic regression model (Daen-LR), the ARIMA-LSTM model, and the multiple linear regression model (MLR) [15,16], we optimize and combine these three algorithms and propose a processing method suitable for classifying complex data that can be used to predict the unknown classification of glass. This method is used to analyze and model the data on the chemical composition and classification of a batch of ancient Chinese glass products, to find the correlation between their chemical composition and the basis of their classification, and to use this relationship to predict the category of unknown glass. The accuracy of the algorithm model was judged by testing for heteroskedasticity in the perturbation terms, testing for multicollinearity, and testing the fit of the model's experimental values to the actual values [17].
With the continuous development of machine learning technology, a variety of machine learning models have been proposed and widely used in classification research. These include Logistic Regression (LR), Naive Bayes (NB), Decision Tree (DT), Support Vector Machine (SVM), Random Forest (RF), Gradient Boosting Tree (GBT), and so on. However, traditional machine learning methods have some drawbacks in solving real-world problems, such as interference from external factors, failure to meet scientific standards, random results, and poor prediction accuracy. To solve these problems, it is necessary to combine established machine learning methods with more advanced ones. This paper presents a joint machine learning algorithm using Daen-LR, ARIMA-LSTM, and the MLR model (JMLA). We first use an improved logistic regression model based on a double adaptive elastic net (Daen-LR) to select variables that have both oracle and adaptive classification properties. Secondly, we use the ARIMA-LSTM model to balance the linear and nonlinear trends in the time series data of the chemical content of glass artifacts before and after weathering. Finally, a multiple linear regression model (MLR) is used to classify unknown glass products.

In this analysis of ancient glass artifacts, the relationship between glass weathering and its chemical composition was identified and statistically analyzed using an improved logistic regression model based on a double adaptive elastic net, i.e., the Daen-Logistic regression (Daen-LR) model. Furthermore, we calculated the p-value of each correlation factor, checked whether each regression coefficient was significant at the 90% confidence level, extracted the strongly correlated elements, and excluded the weakly correlated ones. The characteristics of the model are as follows: logistic regression is an effective method for solving classification problems, in which effective estimation of parameters and selection of variables are extremely important.
The regularization method [18], which considers adding a penalty term to the optimized loss function to estimate parameters, can simultaneously solve the two key points of logistic regression. Elastic net [19] is one of the representatives of this method.
However, the elastic net, although it addresses the estimation of parameters and the identification of important variables in logistic regression, has two major shortcomings: First, the selected variables may not be consistent, i.e., it lacks the oracle property [20]. Second, the specific effects of strongly correlated variables on one another are not considered, i.e., the adaptive classification effect is missing [21,22].
To overcome the first deficiency of the elastic net, the adaptive elastic net was established by combining the adaptive lasso [23] and the ridge penalty to achieve consistency of the selected variables. However, the adaptive weight vector W1, which gives the adaptive elastic net its oracle property, is not easy to set correctly; it is generally determined from initial parameter estimates and a constant δ.
To solve the second defect of the elastic net, Van et al. [24] proposed the generalized ridge, in which parameters are first divided into groups and each group is then given a different ridge penalty. The generalized ridge has an adaptive grouping effect, and its adaptive ridge variant enjoys that effect as well. However, the generalized ridge cannot select variables, so its application is limited.
Based on the existing remedies for the elastic net's deficiencies, it follows that the adaptive lasso and the adaptive ridge have the oracle property and the adaptive grouping effect, respectively, so they can be combined to avoid both disadvantages. This combination of penalties can be called the double adaptive elastic net.
It is assumed that in the composition analysis of glass artifacts there are m chemical composition influence factors; X = (x_1, x_2, ..., x_m) is the vector of chemical-content characteristic variables (i.e., independent variables), m is the number of variables, and the weathering status of the corresponding glass artifact is set as y (i.e., the dependent variable), where y is a dichotomous variable (0 means "unweathered" and 1 means "weathered"). To assess the probability that a particular glass artifact is weathered, the model predicts the probability of y = 1, expressed as P = f(y = 1 | x_1, x_2, ..., x_m). The traditional logistic regression model is then

P = 1 / (1 + exp(−(β_0 + β_1 x_1 + β_2 x_2 + ... + β_m x_m))),

or, equivalently,

ln(P / (1 − P)) = β_0 + β_1 x_1 + β_2 x_2 + ... + β_m x_m.

In the above equations, (β_0, β_1, β_2, ..., β_m) are the regression coefficients to be determined, found by maximum likelihood estimation. Combining the probability functions of y as

P(y_i) = P_i^{y_i} (1 − P_i)^{1 − y_i},

according to the Bernoulli distribution, the maximum likelihood function can be expressed as

L = ∏_{i=1}^{n} P_i^{y_i} (1 − P_i)^{1 − y_i},

and the log-likelihood function as

ln L = Σ_{i=1}^{n} [ y_i ln P_i + (1 − y_i) ln(1 − P_i) ].

Since the log-likelihood function is concave, the point at which its first derivative equals zero is the point of maximum value. By calculating the first derivatives with respect to the undetermined coefficients (β_0, β_1, β_2, ..., β_m) and setting them equal to zero, all the parameters of the system of equations can be solved.
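As a minimal illustration of this maximum-likelihood fit, the following numpy sketch recovers the coefficients of a logistic model by plain gradient ascent on the log-likelihood. The data are synthetic (one stand-in "SiO2-like" feature), and gradient ascent is a simplified stand-in for the Newton-type solver discussed later in the text; function and variable names are ours, not the paper's.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iter=5000):
    """Fit beta = (beta_0, ..., beta_m) by maximising the log-likelihood
    with plain gradient ascent (a simplified stand-in for a Newton solver)."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend intercept column
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Xb @ beta)                 # P(y = 1 | x)
        beta += lr * Xb.T @ (y - p) / len(y)   # gradient of the log-likelihood
    return beta

# Synthetic example: weathering probability driven by one feature with
# true intercept -0.5 and true slope 2.0.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
y = (sigmoid(2.0 * X[:, 0] - 0.5) > rng.uniform(size=200)).astype(float)
beta_hat = fit_logistic(X, y)
```

The recovered slope should be close to the true value of 2, illustrating that the zero-gradient condition of the log-likelihood pins down the coefficients.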
Considering the shortcomings and deficiencies of the traditional logistic regression model, the oracle effect and adaptive classification effect are integrated into the traditional logistic model to create a double adaptive elastic net model, which makes the identification results of glass cultural relics more relevant and persuasive [25,26].

Theory

Adaptive Classification Effect
Given the binary data {(x_i, y_i)}_{i=1}^{n}, let β̂(λ_1, λ_2) be the estimate of the model. By combining the above two schemes to improve the logistic regression equation, the double adaptive elastic net estimator takes the form

β̂_Daen = argmin_β { −ln L(β) + λ_1 Σ_{j=1}^{m} w_{1j} |β_j| + λ_2 Σ_{j=1}^{m} w_{2j} β_j^2 },

where w_{1j} are the adaptive lasso weights and w_{2j} the adaptive ridge weights. Since incorporating the correlation of variables into the regression model helps to improve the accuracy of parameter estimation and variable selection [27], the ridge weights are constructed from the correlation coefficients ρ_{kj} = corr(x_k, x_j) between variables x_k and x_j, with (ε_1, ε_2, ..., ε_m)^T being a vector that makes w_{21}, w_{22}, ..., w_{2m} unequal to each other, and where α = λ_1 / (λ_1 + λ_2), t > 0.
Using the coordinate gradient method and Newton's method to solve for β, Equation (9) can be rewritten in terms of the negative log-likelihood I(β). If β(t) is the solution for β at step t, then I(β) can be approximated by its second-order expansion

I(β) ≈ I(β(t)) + g(t)^T (β − β(t)) + (1/2) (β − β(t))^T h(t) (β − β(t)),

where g(t) and h(t) are, respectively, the gradient and the Hessian matrix of I(β) at β = β(t). Adding the weighted penalty w_{1j} β_j to Equation (12) and setting ∂I(β)/∂β = 0 yields the update for β(t + 1), where W_1 = (0, w_{11}, w_{12}, ..., w_{1m})^T. Since λ_1 h^{−1}(t) W_1 may have some entries less than 0, the parameters of some irrelevant variables cannot become 0; thus, it can be directly rewritten as λ_1 W_1. From the above inference, the solution process for β̂_Daen can be derived: first generate an initial value of β, then repeatedly compute g(t), h(t), and β(t + 1) until convergence [26].
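A penalized logistic fit of this general shape can be sketched with a proximal-gradient loop: the smooth part (log-likelihood plus weighted ridge term) is handled by a gradient step, and the weighted lasso term by soft-thresholding. This is an illustrative stand-in for the Daen solver, not the paper's implementation; the weight vectors w1/w2, penalty strengths, and data here are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of the (weighted) L1 penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def daen_logistic_sketch(X, y, w1, w2, lam1=0.05, lam2=0.05,
                         lr=0.1, n_iter=3000):
    """Illustrative proximal-gradient solver for a weighted elastic-net
    logistic loss; w1/w2 play the role of the adaptive lasso/ridge weights."""
    n, m = X.shape
    beta = np.zeros(m)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (p - y) / n + 2.0 * lam2 * w2 * beta  # smooth part
        beta = soft_threshold(beta - lr * grad, lr * lam1 * w1)
    return beta

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))
# Only the first two features truly matter; the rest are noise.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=300) > 0).astype(float)
w1 = np.ones(5)
w2 = np.ones(5)          # uniform weights for the sketch
beta = daen_logistic_sketch(X, y, w1, w2)
```

The soft-thresholding step shrinks the irrelevant coefficients toward zero while keeping the informative ones, which is the variable-selection behaviour the double adaptive elastic net is designed to deliver.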

Time Series Forecasting Model Based on ARIMA-LSTM

ARIMA(p,d,q) Model

By using the ARIMA(p,d,q) model, it is possible to analyze observations at past time points, depict the intrinsic links between them, and predict future values, based on past values and a linear error equation [28][29][30][31][32]. The ARIMA model is usually denoted ARIMA(p,d,q), where p is the number of autoregressive terms, q is the number of moving-average terms, and d is the number of differences needed to make the series stationary [33]. The correlogram, autocorrelation function (ACF), and partial autocorrelation function (PACF) of the time series provide information about the lags [34]. If the time series is found to be stationary, the model can be used for estimation and forecasting; if it is not, it must be transformed to stationarity by differencing before ARIMA can be applied. After identification, an ARIMA model is estimated for the specific stationary time series. The simple ARIMA model is estimated based on the number of effective coefficients, the Bayesian information criterion (BIC), the Akaike information criterion (AIC), and the adjusted R^2 [35]. After estimation, the selected ARIMA model needs to be diagnosed to check whether the residuals are white noise; if they are not, the model must be re-estimated, and Q-tests and normality tests can be used to diagnose the residuals [36]. Typically, the ARIMA model is written as

(1 − Σ_{i=1}^{p} Φ_i L^i)(1 − L)^d Y_t = (1 + Σ_{j=1}^{q} θ_j L^j) ε_t,

where L is the lag operator. For this study, the more chemical-content parameters are included, the better the model fit, but at the cost of increased model complexity; model selection should therefore seek the best balance between complexity and the model's ability to explain the data.
According to the Bayesian information criterion, the optimal balance between fit and complexity is found where the BIC is smallest [37]:

BIC = n ln T − 2 ln M,

where T is the number of samples, n is the number of unknown parameters (n = p + q + 1), and M is the maximum likelihood value of the model. In the maximum likelihood estimation of the ARIMA(p,d,q) model [38], Φ_1, Φ_2, ..., Φ_p represent the linear autocorrelation coefficients between Y_{t−1}, ..., Y_{t−p} and Y_t. By conditioning on the intervening p − 1 terms, the direct relationship between Y_{t−p} and Y_t can be separated out, and this relationship is linear; Φ_1, Φ_2, ..., Φ_p measure the size of this influence, which is the so-called PACF (θ_1, ..., θ_q play the same role for the moving-average part). ε_t is the disturbance term, with ε_t ∼ iid N(0, σ^2). The estimation of the likelihood function for the autoregressive process is conditioned on the initial values of y, and the estimation of the likelihood function for the moving-average process is conditioned on the initial values of ε; ARIMA(p,d,q) is then conditioned on the difference order d and the initial values of y and ε.
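The BIC-based order selection described above can be sketched as follows. This is a simplified stand-in (least-squares AR fitting on an already-differenced series, with the Gaussian-likelihood form of the BIC, T ln σ̂² + n ln T, in place of −2 ln M); the simulated series and function names are ours, not the paper's data.

```python
import numpy as np

def fit_ar(y, p):
    """Least-squares AR(p) fit; a simplified stand-in for full
    ARIMA maximum-likelihood estimation on a differenced series."""
    Y = y[p:]
    X = np.column_stack([y[p - k:len(y) - k] for k in range(1, p + 1)])
    X = np.column_stack([np.ones(len(Y)), X])
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ coef
    return coef, resid

def bic(resid, n_params):
    """Gaussian-likelihood BIC: T ln(sigma^2) + n ln(T)."""
    T = len(resid)
    sigma2 = np.mean(resid ** 2)
    return T * np.log(sigma2) + n_params * np.log(T)

rng = np.random.default_rng(2)
# Simulated AR(2) series standing in for a differenced chemical-content series.
y = np.zeros(300)
for t in range(2, 300):
    y[t] = 0.5 * y[t - 1] - 0.3 * y[t - 2] + rng.normal()

# Score candidate orders and pick the one with the smallest BIC.
scores = {p: bic(fit_ar(y, p)[1], p + 1) for p in (1, 2, 3, 4)}
best_p = min(scores, key=scores.get)
```

Scanning a small grid of candidate orders and keeping the minimum-BIC model is the same balance between fit and complexity the text describes.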

LSTM Model
The ARIMA(p,d,q) model can deal well with the linear part of the chemical composition content in the time series, but it has certain limitations: the resulting residual series has nonlinear characteristics, and the change in the content of some chemical components with the degree of weathering is a nonlinear process. This calls for a deep learning model to capture the nonlinear trend of chemical composition changes [39]. The LSTM (Long Short-Term Memory) model is a deep learning model that is very good at fitting nonlinear data. Its nonlinear gate units adjust the information flowing into and out of the memory cells at each time point, so as to better fit the trend of nonlinear data changing over time.
LSTM is a special type of recurrent neural network (RNN) that performs very well on long sequences of data, mainly addressing the vanishing gradient, exploding gradient, and overfitting problems that arise when training on long sequences [40][41][42][43]. An RNN is an artificial neural network that operates on time-series data and can use back-propagation to learn and adapt to the relationship between inputs and outputs. In contrast to a standard RNN, an LSTM has an input gate, a forgetting gate, and an output gate that control the way information flows through the network. These gates allow the LSTM to store past information and update the current state appropriately, providing a significant advantage when dealing with long sequences of data. The basic structure is shown in Figure 1 [44,45].
The forgetting gate controls what is discarded from the cell state, while the input gate transforms the input by means of the σ and tanh functions; the associated intermediate output is determined by the updated cell state and the output gate [32]. The calculation formulas are shown in (22) to (27):

f_t = σ(W_f · [h_{t−1}, x_t] + b_f)
i_t = σ(W_i · [h_{t−1}, x_t] + b_i)
C̃_t = tanh(W_C · [h_{t−1}, x_t] + b_C)
C_t = f_t ⊙ C_{t−1} + i_t ⊙ C̃_t
o_t = σ(W_o · [h_{t−1}, x_t] + b_o)
h_t = o_t ⊙ tanh(C_t)

Here f_t, i_t, C̃_t, C_t, and o_t are the forgetting gate, input gate, new candidate vector, updated cell state, and output gate, respectively; W_f, W_i, W_C, W_o and b_f, b_i, b_C, b_o are the corresponding weight coefficient matrices and bias terms; tanh and σ represent the hyperbolic tangent activation function and the S-shaped (sigmoid) activation function [45].
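These gate equations can be executed directly; the following numpy sketch implements one LSTM-cell step. It is an illustration of the standard LSTM recurrence, not the paper's trained network; the hidden size, input size, and random weights are arbitrary.

```python
import numpy as np

def lstm_cell(x_t, h_prev, c_prev, W, b):
    """Single LSTM-cell step implementing the gate equations above.
    W maps the concatenated [h_prev, x_t] to the four stacked gate
    pre-activations (forget, input, candidate, output)."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    z = W @ np.concatenate([h_prev, x_t]) + b
    H = len(h_prev)
    f = sigmoid(z[0:H])             # forgetting gate
    i = sigmoid(z[H:2 * H])         # input gate
    c_tilde = np.tanh(z[2 * H:3 * H])  # new candidate vector
    o = sigmoid(z[3 * H:4 * H])     # output gate
    c = f * c_prev + i * c_tilde    # updated cell state
    h = o * np.tanh(c)              # hidden output
    return h, c

rng = np.random.default_rng(3)
H, D = 4, 3                         # hidden size, input size (arbitrary)
W = rng.normal(scale=0.1, size=(4 * H, H + D))
b = np.zeros(4 * H)
h, c = lstm_cell(rng.normal(size=D), np.zeros(H), np.zeros(H), W, b)
```

Because the output is o ⊙ tanh(c) with o ∈ (0, 1), the hidden state stays bounded in (−1, 1), which is part of what keeps gradients well behaved over long sequences.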


ARIMA-LSTM Model
In order to deal with the linear and nonlinear trends in the time series data of chemical composition content before and after weathering of cultural relics, the unique advantages of the ARIMA model in dealing with linear data and the excellent performance of the LSTM in dealing with nonlinearity were used [46,47]. First, the artifact chemical content data were processed, and linear prediction results and residual series were obtained with the help of the ARIMA model. Then, the nonlinear factors of the residual series were further analyzed by the LSTM model, and the nonlinear prediction results were obtained. Finally, the linear and nonlinear prediction results were superimposed to obtain the final prediction results for the chemical composition content. According to the decomposition principle of the time series model, it is assumed that the time series Y = {y_t, t = 1, 2, ..., N} consists of a linear component x_t and a nonlinear component z_t, i.e., y_t = x_t + z_t. Therefore, the one-dimensional chemical-component data are first linearly predicted by the ARIMA model to obtain the linear prediction x̂_t and the residual series δ_t = y_t − x̂_t. The residual series is then processed by further nonlinear prediction to obtain the nonlinear component ẑ_t. Finally, the linear and nonlinear components are combined to obtain the final prediction ŷ_t = x̂_t + ẑ_t. Root mean square error (RMSE) [48], mean absolute percentage error (MAPE), and R^2 are used to evaluate the performance of the model [49][50][51][52].
R^2 usually lies in [0,1]; the closer R^2 is to 1, the better the fit. The evaluation formulas are:

RMSE = sqrt( (1/N) Σ_{i=1}^{N} (y_i − ŷ_i)^2 )

MAPE = (100%/N) Σ_{i=1}^{N} | (y_i − ŷ_i) / y_i |

R^2 = 1 − Σ_{i} (y_i − ŷ_i)^2 / Σ_{i} (y_i − ȳ)^2

In the above equations, y_i is the observed value, ŷ_i is the predicted (regression-fitted) value, N is the sample size, and ȳ is the mean of y_i [32].
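These three metrics are straightforward to compute; a minimal sketch on a small hypothetical observed/predicted pair:

```python
import numpy as np

def rmse(y, y_hat):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def mape(y, y_hat):
    """Mean absolute percentage error, in percent."""
    return float(100.0 * np.mean(np.abs((y - y_hat) / y)))

def r2(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Tiny illustrative example (hypothetical values).
y = np.array([10.0, 12.0, 11.0, 13.0])
y_hat = np.array([10.5, 11.5, 11.0, 13.5])
# rmse(y, y_hat) ≈ 0.433, mape(y, y_hat) ≈ 3.25 %, r2(y, y_hat) = 0.85
```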

Multiple Linear Regression Model
Multiple linear regression (MLR) is a statistical method that predicts the distribution of the dependent variable by using multiple independent factors [15]. The goal of the MLR model is to establish linear links between independent and dependent characteristics that influence a given event, and it is an extension of classical least squares regression because it employs multiple explanatory factors.
The model takes the form

y = α_0 + α_1 x_1 + α_2 x_2 + ... + α_n x_n + μ,

where y is the dependent variable, x_1, ..., x_n are the independent variables, α_0 is the y-intercept, α_i is the regression coefficient of the i-th independent variable, and μ is the model error, also known as the residual. The magnitude of the coefficient of determination (R^2) and the mean squared error (MSE) can be used to assess the predictive performance of the MLR model [15]:

R^2 = 1 − Σ_{j} (y_j − ŷ_j)^2 / Σ_{j} (y_j − ȳ)^2

MSE = (1/n) Σ_{j} (y_j − ŷ_j)^2

where y_j is the j-th parameter after normalization, ŷ_j is the j-th predicted parameter, ȳ is the mean of the parameters, and n is the number of samples.
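An MLR fit with these two performance measures can be sketched via ordinary least squares; the synthetic two-feature data below (true coefficients 1, 2, −3) are hypothetical, not the glass data.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least-squares fit of y = a0 + a1*x1 + ... + an*xn + mu,
    returning the coefficients, fitted values, MSE, and R^2."""
    Xb = np.column_stack([np.ones(len(y)), X])
    alpha, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    y_hat = Xb @ alpha
    mse = float(np.mean((y - y_hat) ** 2))
    r2 = float(1.0 - np.sum((y - y_hat) ** 2)
               / np.sum((y - np.mean(y)) ** 2))
    return alpha, y_hat, mse, r2

rng = np.random.default_rng(5)
X = rng.normal(size=(100, 2))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(scale=0.1, size=100)
alpha, y_hat, mse, r2 = fit_mlr(X, y)
```

With low noise the fitted coefficients land close to (1, 2, −3) and R^2 approaches 1, mirroring how the fit diagnostics are read in the text.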
We performed the BP test (Breusch-Pagan test) and the White test on the perturbation term μ_i to check for heteroskedasticity. If the perturbation term is correlated with the independent variables, the regression coefficients of the model may be inaccurate, leading to large errors in the results.
In the BP test, it is assumed that the regression model is

y_i = β_1 + β_2 x_{i2} + ... + β_k x_{ik} + ε_i.

If H_0 (homoskedasticity) is not true, then the conditional variance E(ε_i^2 | x_2, ..., x_k) is a function of (x_2, ..., x_k), called the conditional variance function. The BP test assumes that the conditional variance function is linear:

E(ε_i^2 | x_2, ..., x_k) = δ_1 + δ_2 x_{i2} + ... + δ_k x_{ik},

so the original hypothesis can be simplified to

H_0: δ_2 = δ_3 = ... = δ_k = 0.

If H_0 is true, it can be shown that ε_i has no correlation with the independent variables x_{ik}; that is, there is no autocorrelation, and the perturbation term has no heteroskedasticity. Since the perturbation term ε_i is not observable, the squared residual e_i^2 is used in an auxiliary regression on the explanatory variables:

e_i^2 = δ_1 + δ_2 x_{i2} + ... + δ_k x_{ik} + u_i,

and the statistic nR^2 is used, where R^2 is the R^2 of the auxiliary regression. The difference between the White test and the BP test is that the White test's auxiliary regression also includes the squared and cross terms of x_{ik} in Equation (37), so the BP test can be regarded as a special case of the White test.
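The nR^2 statistic of the BP auxiliary regression can be computed directly; a minimal sketch on synthetic homoskedastic data (all names and the data are ours). The statistic would be compared against a chi-square critical value with k − 1 degrees of freedom (e.g., about 5.99 for 2 regressors at the 5% level).

```python
import numpy as np

def bp_statistic(X, resid):
    """Breusch-Pagan auxiliary regression: regress e^2 on the regressors
    and return the n * R^2 statistic."""
    e2 = resid ** 2
    Xb = np.column_stack([np.ones(len(e2)), X])
    coef, *_ = np.linalg.lstsq(Xb, e2, rcond=None)
    fitted = Xb @ coef
    r2 = 1.0 - np.sum((e2 - fitted) ** 2) / np.sum((e2 - np.mean(e2)) ** 2)
    return float(len(e2) * r2)

rng = np.random.default_rng(6)
X = rng.normal(size=(500, 2))
Xb = np.column_stack([np.ones(500), X])
beta = np.array([1.0, 2.0, -1.0])
# Homoskedastic errors, so nR^2 should stay small (no heteroskedasticity).
y = Xb @ beta + rng.normal(size=500)
resid = y - Xb @ np.linalg.lstsq(Xb, y, rcond=None)[0]
stat = bp_statistic(X, resid)
```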
In addition, we tested the model for multicollinearity; the variance inflation factor (VIF) was used to eliminate influence factors with multicollinearity, which improved the accuracy of the model. Assuming that there are k independent variables, the variance inflation factor of the n-th variable is

VIF_n = 1 / (1 − R_n^2),

where R_n^2 is the goodness of fit obtained by regressing the n-th independent variable, as the dependent variable, on the remaining k − 1 independent variables. The larger VIF_n is, the greater the correlation between the n-th variable and the other variables [53]. If VIF_n is greater than 10, there is severe multicollinearity between the variables.
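The VIF screen can be sketched in a few lines: regress each column on the others and apply VIF = 1/(1 − R^2). The three synthetic columns below (the third nearly collinear with the first) are a hypothetical illustration.

```python
import numpy as np

def vif(X):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    column j on the remaining columns (plus an intercept)."""
    out = []
    n, k = X.shape
    for j in range(k):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        fitted = others @ coef
        r2 = 1.0 - np.sum((target - fitted) ** 2) / np.sum(
            (target - np.mean(target)) ** 2)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(7)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + 0.05 * rng.normal(size=200)   # nearly collinear with x1
vifs = vif(np.column_stack([x1, x2, x3]))
```

Here the collinear pair blows past the VIF > 10 threshold while the independent column stays near 1, matching the screening rule in the text.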

Data Pre-Processing
This study is based on the data related to ancient glassware provided by the official website of the 2022 China Student Mathematical Modeling Competition [14]. The glass sampling points are discussed separately for two systems: lead-barium glass and high-potassium glass. The data give the proportions of the chemical components at the sampling points of this batch of artifacts. Being compositional data, the component proportions at each point should sum to 100%; however, owing to limitations of the detection methods or the presence of various impurities, the cumulative sum of the components may deviate from 100%. Thus, in this study, data with component sums between 85% and 105% were kept as valid, and data from severely weathered glass were excluded to eliminate the influence of outliers on the model results. The results are shown in Table A1.
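The 85-105% validity screen amounts to a simple row filter on the composition table; a minimal sketch with hypothetical oxide percentages (not the competition data):

```python
import numpy as np

# Hypothetical composition table: rows are sampling points, columns are
# oxide percentages. Rows whose totals fall outside 85-105% are dropped.
data = np.array([
    [69.3,  9.9, 6.3,  5.0],   # sums to  90.5 -> keep
    [40.1, 10.2, 8.4,  3.3],   # sums to  62.0 -> drop
    [71.2, 12.5, 9.8, 13.1],   # sums to 106.6 -> drop
])
totals = data.sum(axis=1)
valid = data[(totals >= 85.0) & (totals <= 105.0)]
```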

Experimental Procedure
This paper focuses on three improved joint model algorithms: Daen-LR, ARIMA-LSTM, and MLR. Matlab 2021b, SPSS, and Stata were used as the experimental software environment to analyze the identification of ancient glasses.
First, in this paper, the obtained pre-processed data set is used to find the relationship between the chemical composition content and weathering at its sampling points after glass classification by building an improved logistic regression model based on a double adaptive elastic net. Then, by using the ARIMA-LSTM model, we predict the content of chemical components contained in the two glasses before weathering and obtain the correlation curves of chemical components before and after weathering. Finally, based on the results obtained above, this paper uses a multiple linear regression model to predict the type of unknown glass and judges the accuracy and efficiency of the model by testing whether there is heteroskedasticity in the perturbation terms, multicollinearity, and the degree of fit between the experimental and actual values of the model. The flow chart is shown in Figure 2.

Relationship between Glass Weathering and Its Chemical Composition Based on an Improved Logistic Regression Model with Double Adaptive Elastic Net
Based on the data in Table A1, we use Matlab and SPSS to conduct modeling and calculation of the Daen-Logistic Regression model. The dependent variable here is a dichotomous variable (i.e., weathered and unweathered states), and the content of various chemical components is set as the independent variables. The double adaptive elastic net model can determine the classification results of weathered or unweathered glass under different conditions for each independent variable, which avoids variability of the results under different selections of independent variables, makes the classification more adaptive, and reduces the influence of strongly correlated variables on other variables. The calculation gives the results in Table 1. From the table for the high-potassium glass type, it can be seen that the values of two chemical components, SiO2 and K2O, are relatively large, and their significance p-values are less than p = 0.1, so these two chemical components have the greatest influence on whether the surface of high-potassium glass is weathered or not. From the table for the lead-barium glass type, it can be seen that the values of three chemical components, SiO2, PbO, and P2O5, are relatively large, and their significance p-values are less than p = 0.1, so these three chemical components have the greatest influence on whether the surface of lead-barium glass is weathered or not.

Prediction of the Chemical Content of Glass before Weathering Based on the ARIMA-LSTM Model
From the solution in Section 4.1, we obtained the relationship between glass weathering and chemical composition, and using this relationship we screened the 14 chemical components for the two types, high-potassium and lead-barium, respectively. For high-potassium glasses, we chose to retain the SiO2 and K2O components. For lead-barium glasses, we chose to retain three components: SiO2, PbO, and P2O5. All of them have relatively complete data and strong correlations for modeling analysis.
To help the model identify the patterns in the data, outliers with large deviations were first eliminated. SPSS 24 software was used to detect three abnormal data values of additive or transient type; the existence of such outliers would lead to accidental results in the model and hence to wrong conclusions. Taking the SiO2 content of high-potassium glass and lead-barium glass as examples, the outliers of both are shown in Table 2. Through analysis and calculation in Matlab and SPSS, we tested and fitted the data values of all chemical composition contents changing along the time series and established the ARIMA-LSTM prediction curve model. We found that the ARIMA(2,1,0)-LSTM parameters attain the maximum likelihood value of the model; the normalized BIC [54] values of 3.160 and 4.160 for the SiO2 content in high-potassium glass and lead-barium glass, for example, are the smallest among the candidate parameters. In addition, the stationary R^2 values of the model are 0.960 and 0.934, both close to 1, and both p-values are 0.000, less than 0.05, so the model results can be considered significantly reasonable and to fit the prediction model well (Table 3). After the initial estimation of the time series model of chemical composition content, a white noise test of the residuals is required. If the residuals are white noise, the selected model has identified the laws of the time series data, i.e., the model is acceptable; if the residuals are not white noise, some information remains unidentified, and the model parameters need to be revised to capture this remaining information.
The study used Ljung and Box's Q test to determine whether the residuals are white noise [55,56]: Assuming that the residual { t } is a white noise sequence, then ρ s = 1, s = 0 0, s = 0 , the autocorrelation coefficient of the sample, is: In H 0 : ρ 1 = ρ 2 = · · · = ρ S = 0, H 1 : ρ i (i = 1, 2, . . . , s) at least one is not 0. In the case that H 0 holds, the statistic Q = T(T + 2) s ∑ k+1 r 2 k T−k ∼ X 2 s−n , from which the p-value can be calculated, and if the p-value is less than 0.05, then the original hypothesis is rejected, indicating that the model is not fully identified and the model parameters need to be modified. Through the model statistics, the p-values of the Ljung and Box's Q test for SiO 2 content of high potassium glass and lead-barium glass are 0.889 and 0.744, respectively, both of which are greater than 0.05, i.e., we cannot reject the original hypothesis, and we can assume that the residuals are white noise sequences and the model can be fully identified. Figure 3 shows that the autocorrelation coefficients and partial autocorrelation coefficients of all lag orders are not significantly different from 0 [57,58]. Appl Through the model statistics, the p-values of the Ljung and Box's Q test for SiO2 content of high potassium glass and lead-barium glass are 0.889 and 0.744, respectively, both of which are greater than 0.05, i.e., we cannot reject the original hypothesis, and we can assume that the residuals are white noise sequences and the model can be fully identified. Figure 3 shows that the autocorrelation coefficients and partial autocorrelation coefficients of all lag orders are not significantly different from 0 [57,58]. By the same method, the fitting coefficients of all mathematical models of the measured chemical composition contents were obtained. In the category of high potassium glass, the R 2 values of SiO2 and K2O were 0.960 and 0.969, respectively. 
In the lead-barium category, the R^2 values of SiO2, P2O5, and PbO are 0.934, 0.951, and 0.948, respectively. Finally, the corresponding prediction model curves were drawn, from which the correlation of chemical composition content before and after weathering can be clearly seen, as shown in Figure 4. The blue curve represents the actual value of chemical content changing with time after weathering, while the yellow curve represents the fitted value; the agreement between the two reflects the performance of the model. The red curve represents the predicted value of the component content over time before weathering. It can be seen that the ARIMA(2,1,0)-LSTM model captures the correlation of chemical composition contents before and after weathering, reduces the interference of the "weathering" factor on glass classification, and improves the accuracy of subsequent glass classification.
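For reference, the "I" in ARIMA(2,1,0) is a single first-order differencing of the content series, which must be inverted to turn predicted differences back into content levels before plotting curves like those in Figure 4. A minimal sketch (our helper names, toy data):

```python
def difference(series, d=1):
    """Apply d-th order differencing (the 'I' step of ARIMA)."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series

def undifference(diffed, first_value):
    """Invert first-order differencing given the initial level."""
    out = [first_value]
    for step in diffed:
        out.append(out[-1] + step)
    return out

# Toy content series: differencing then undifferencing recovers it exactly.
s = [2.0, 3.5, 5.0, 4.0]
d1 = difference(s)              # [1.5, 1.5, -1.0]
restored = undifference(d1, s[0])
```

The AR(2) part then models the differenced series, while the LSTM captures any remaining nonlinear structure in the residuals.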


Identifying Unknown Artifact Types Based on Multiple Linear Regression Model
Building on the results of Sections 4.1 and 4.2, we tested and analyzed the chemical content of a batch of newly excavated glass artifacts, as shown in Table 4, and judged the categories to which they belonged from the correlations of the related elements and weathering effects. First, the chemical elements significantly correlated with each category were screened out from the statistical distribution of element contents, and a multiple linear regression equation between elements and categories was established; the fitted category values were then compared with the actual categories to compute the errors and verify the accuracy of the model. According to the classification rules of chemical content and surface weathering, it can be initially concluded that K2O, CaO, MgO, Al2O3, FeO, PbO, BaO, and P2O5 are strongly correlated with surface weathering and category, while the remaining elements are weakly correlated and can be deleted. In addition, because the chemical element contents of the three heavily weathered glass artifacts differ greatly from the others, which would have a large impact on the analysis of the model, they are treated as outliers.
In the multiple linear regression equation, qualitative data should be coded as dummy variables: the qualitative surface-weathering independent variable Suw (unweathered/weathered) is coded as a quantitative 0/1 variable, and the qualitative category dependent variable y_i (high-potassium/lead-barium) is likewise coded as 1/0. The multiple linear regression equation is then established with

Suw_i = 1 denoting the i-th weathered sample, Suw_i = 0 denoting the i-th unweathered sample,

so that

E(y | Suw = 1, other independent variables) = β × 1 + m (constant terms),
E(y | Suw = 0, other independent variables) = β × 0 + m (constant terms).

The joint significance test of the model uses the F-statistic [59,60]: F(10, 55) = 51.41 at the 95% confidence level, with null hypothesis H0: a_1 = a_2 = ... = a_9 = β = 0. From Table 5, the p-value is 0, less than 0.05, so the null hypothesis is rejected; we have reason to believe that the coefficients are jointly significantly different from 0, and the model can be considered useful. The regression coefficients and corresponding p-values for the variables of interest are given in Table 6. A coefficient is considered significant, and hence credible, only when its p-value is less than 0.05, so we use the regression coefficients corresponding to K2O, Al2O3, PbO, BaO, and Suw (the dummy variable "weathering"); the larger the absolute value of a regression coefficient, the greater its effect on the dependent variable. The resulting multiple linear regression equation relating glass artifact class, chemical element content, and weathering type is as follows:
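The role of the dummy variable can be seen in a minimal sketch: with only Suw and an intercept, OLS reduces to group means, so the fitted coefficient β is exactly E(y | Suw = 1) − E(y | Suw = 0), matching the two conditional expectations above. The function name and data below are illustrative, not the paper's.

```python
def dummy_regression(suw, y):
    """OLS of y on a single 0/1 dummy plus intercept:
    the intercept m is the mean of the Suw = 0 group and the
    slope beta is the difference between the two group means."""
    y0 = [v for s, v in zip(suw, y) if s == 0]
    y1 = [v for s, v in zip(suw, y) if s == 1]
    m = sum(y0) / len(y0)             # E(y | Suw = 0)
    beta = sum(y1) / len(y1) - m      # E(y | Suw = 1) - E(y | Suw = 0)
    return beta, m

# Toy data: two unweathered (Suw = 0) and two weathered (Suw = 1) samples.
beta, m = dummy_regression([0, 0, 1, 1], [0.1, 0.3, 0.8, 1.0])
```

In the full model, the other element-content regressors shift these conditional means, but the interpretation of β as the weathering effect is unchanged.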

Testing for the Presence of Heteroskedasticity in the Perturbation Term
The perturbation term µ_i is unobservable and must satisfy certain conditions. Our model assumes a spherical perturbation term, which generally requires "no autocorrelation" and "homoskedasticity": if the perturbation term is correlated with the independent variables, i.e., endogenous, the estimated regression coefficients become inaccurate; if there is heteroskedasticity, the hypothesis test statistics we constructed become invalid, and the OLS estimator can no longer be treated as the best linear unbiased estimator [61]. Therefore, we performed the BP (Breusch-Pagan) test and the White test on the perturbation term to check for heteroskedasticity, as shown in Table 7 [62-64]. In both tests, the null hypothesis H0 is that there is no heteroskedasticity in the perturbation term. Since the p-values are greater than 0.05, we fail to reject H0 and can assume that the perturbation term is homoskedastic.
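The idea of the Breusch-Pagan test can be sketched for the one-regressor case: regress the squared residuals on the regressor and form the LM statistic n·R²; a large value signals that the residual variance depends on the regressor, i.e., heteroskedasticity. All names and data below are illustrative, not the paper's.

```python
def r_squared(x, y):
    """R^2 of a simple linear regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

def breusch_pagan_lm(resid, x):
    """LM statistic n * R^2 from regressing squared residuals on x."""
    sq = [e * e for e in resid]
    return len(resid) * r_squared(x, sq)

x = list(range(1, 11))
resid_hom = [0.5, -0.4, 0.45, -0.5, 0.4, -0.45, 0.5, -0.4, 0.45, -0.5]
resid_het = [0.1 * xi * (-1) ** xi for xi in x]   # spread grows with x
lm_hom = breusch_pagan_lm(resid_hom, x)
lm_het = breusch_pagan_lm(resid_het, x)
```

Comparing the LM statistic against a χ² critical value (or its p-value against 0.05) then yields the accept/reject decision reported in Table 7; the White test works similarly but also includes squares and cross-products of the regressors.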

Testing for Multicollinearity
If the data matrix X does not have full column rank, i.e., one variable can be expressed as a linear combination of the other explanatory variables, there is "strict multicollinearity". Stata was used to calculate the VIF of each variable, with the results shown in Table 8. It is generally believed that VIF > 10 indicates severe multicollinearity in the regression equation. SiO2 and PbO both exceed 10, but the p-value of SiO2 is above 0.05, so its coefficient is not significant and is excluded from the equation model. Although the VIF of PbO also exceeds 10, its p-value is below 0.05: the coefficient remains significant despite the variance inflation (and would be even more significant in the absence of multicollinearity), so it is retained.
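The VIF of a predictor is 1/(1 − R²_j), where R²_j comes from regressing that predictor on the others. In the two-predictor case, R²_j reduces to the squared correlation between the two predictors, which gives a compact sketch (our function name, invented data):

```python
def vif_two_predictors(x1, x2):
    """VIF for either of two predictors: 1 / (1 - r^2),
    where r is their sample correlation coefficient."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    num = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    den = (sum((a - m1) ** 2 for a in x1)
           * sum((b - m2) ** 2 for b in x2)) ** 0.5
    r = num / den
    return 1.0 / (1.0 - r * r)

x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [2.0, 4.1, 5.9, 8.2, 10.0, 12.1, 13.9, 16.0]   # nearly 2 * x1: collinear
x3 = [5.0, 1.0, 4.0, 2.0, 6.0, 3.0, 7.0, 2.0]        # weakly related to x1
vif_high = vif_two_predictors(x1, x2)
vif_low = vif_two_predictors(x1, x3)
```

With more than two predictors the same 1/(1 − R²_j) formula applies, but R²_j must come from a full auxiliary regression, which is what Stata computes for Table 8.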

Testing the Fit of the Experimental and Actual Values of the Model and Identifying the Unknown Artifact Types
Since the dependent variable is the category of glass artifact, with only two categories (high-potassium and lead-barium), it can be treated as a 1-0 variable: if the fitted value is close to 1, the sample is classified as high-potassium; if it is close to 0, as lead-barium. From Figure 5 and Table 9, it can be seen that the 66 samples fit very well, with almost no chance deviations, and the identification results for the 8 unknown relics are completely consistent with reality, so the predictions of this multiple linear regression equation can be considered quite accurate.

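The 1-0 treatment described above amounts to thresholding the fitted regression values at 0.5; a minimal sketch (the label strings and fitted values are ours, for illustration):

```python
def classify(fitted_values, threshold=0.5):
    """Map fitted regression values to the two glass classes:
    close to 1 -> high-potassium, close to 0 -> lead-barium."""
    return ["high-potassium" if v >= threshold else "lead-barium"
            for v in fitted_values]

# Fitted values may fall slightly outside [0, 1]; the threshold still applies.
labels = classify([0.97, 0.08, 1.04, -0.03])
```

This is the step that turns the continuous output of the multiple linear regression into the discrete category assignments reported in Table 9.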

Comparison of Different Models
To demonstrate the superiority of the proposed method, we compare the proposed joint algorithm with similar decision and classification algorithms: Decision Trees (DT), Random Forests (RF) [65], Support Vector Machines (SVM), and Random Forests based on classification and regression trees (CART-RF) [66-69]. We did not use any pre-trained models; each model was trained from scratch. When selecting the parameters of the traditional machine learning algorithms, we took into account the number of data features and the need to avoid overfitting, as can be seen in Table 10. We then performed experimental simulations of these comparison models, as well as the model proposed in this paper, using Matlab; the results are presented in Table 11. In the classification results, this study uses common evaluation indicators to judge the superiority of the models: Train Acc, Test Acc, Precision, Recall, and F1 Score. TP, TN, FP, and FN are required to explain these indicators, so the confusion matrix is introduced, as shown in Figure 6. The specific definitions are as follows:

Table 10. Selected parameters of each algorithm.

Algorithms Parameters

DT: node splitting evaluation criterion = gini; feature division point selection criterion = random; minimum samples for internal node splitting = 2; minimum samples in a leaf node = 1; maximum leaf nodes = 2; maximum depth of the tree = 15

TP (True Positive): the true value of the data is high potassium, and the predicted value is also high potassium. TN (True Negative): the true value of the data is lead barium, and the predicted value is also lead barium. FP (False Positive): the true value of the data is lead barium, but it is incorrectly predicted as high potassium. FN (False Negative): the true value of the data is high potassium, but it is incorrectly predicted as lead barium.

RF
Accuracy is the simplest and clearest index for evaluating classification models, but it is a good measure only when the class proportions in the data set are fairly balanced, as shown in Equation (41). Precision represents the proportion of predicted positive samples that are actually positive, as shown in Equation (42). Recall represents the proportion of actual positive samples that are correctly predicted as positive, as shown in Equation (43). The F1 Score is the harmonic mean of precision and recall, synthesizing both; its value reflects the robustness of the model, and the higher the F1 Score, the more stable the model, as shown in Equation (44).
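The quantities in Equations (41)-(44) can be sketched directly from the confusion-matrix counts, with high-potassium taken as the positive class; the function names and labels below are ours, and the data are invented for illustration.

```python
def confusion_counts(actual, predicted, positive="high-potassium"):
    """Count TP, TN, FP, FN with the given label as the positive class."""
    tp = sum(a == positive and p == positive for a, p in zip(actual, predicted))
    tn = sum(a != positive and p != positive for a, p in zip(actual, predicted))
    fp = sum(a != positive and p == positive for a, p in zip(actual, predicted))
    fn = sum(a == positive and p != positive for a, p in zip(actual, predicted))
    return tp, tn, fp, fn

def metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Toy evaluation: one high-potassium sample missed, one lead-barium misassigned.
actual = ["high-potassium"] * 4 + ["lead-barium"] * 4
predicted = ["high-potassium", "high-potassium", "high-potassium", "lead-barium",
             "lead-barium", "lead-barium", "lead-barium", "high-potassium"]
tp, tn, fp, fn = confusion_counts(actual, predicted)
accuracy, precision, recall, f1 = metrics(tp, tn, fp, fn)
```

The same computation, applied per model to the train and test predictions, produces the Train Acc, Test Acc, Precision, Recall, and F1 Score columns of Table 11.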

Discussion
From the comparison experiments, it is clear that the joint algorithm proposed in this paper shows notable advantages in all performance metrics. As shown in Table 11, on the Train Acc, Test Acc, Precision, Recall, and F1 Score indicators, JMLA outperforms the best previous algorithm model by +0.017, +0.025, +0.046, +0.035, and +0.040, respectively. We improve common machine learning algorithms and combine them with deep learning models to make the classification results more accurate, providing a new approach to the study of the classification of ancient cultural relics.
Considering the accuracy of JMLA's classification results and its excellent evaluation indexes, this study believes that the proposed model can provide more in-depth research ideas for the classification of ancient cultural relics. The ideas behind the algorithm can also be applied to other related fields, such as the analysis of nutrient elements in food, the influence of the degree of air oxidation on nutrients, and the classification of water pollution levels. However, there is still room for improvement: compared with existing algorithms, JMLA has a higher computational cost and more complex formulas. Further reducing the algorithm's complexity and better unifying the three constituent algorithms will be the focus of future research.

Conclusions
In this paper, we propose a joint Daen-LR, ARIMA-LSTM, and MLR machine learning algorithm (JMLA). Firstly, we combine a double adaptive elastic network with a traditional logistic model to select variables that have both Oracle and adaptive classification characteristics. These two characteristics eliminate the influence of different categories on the inconsistent selection of important independent variables and the influence of strong-correlation independent variables on the interference of weak independent variables. Secondly, we combine the deep learning model (LSTM) with the ARIMA time series model so that it can handle both linear and nonlinear trends. By calculating the ARIMA-LSTM model, we establish the correlation curve of chemical composition before and after weathering and predict the change in chemical composition with weathering. Thirdly, we combine the data processed by the above two improved methods with the multiple linear regression model to classify the unknown glass relics.
The experimental results show that the accuracy of the JMLA model is 97.9% on the train set and 97.6% on the test set. In addition, we compared JMLA with similar classifiers on the Train Acc, Test Acc, Precision, Recall, and F1 Score indexes; JMLA outperforms the best previous algorithm model by +0.017, +0.025, +0.046, +0.035, and +0.040, respectively. These data show that, without changing the structure of the compared classification models and under the same experimental conditions, the JMLA model performs better than the other classification models. Its classification accuracy is higher than that of the other models, especially for large glass relics with many chemical elements recovered from harsh environments.
This processing method is practical and reliable in the direction related to the composition analysis and identification of cultural heritage. The application of this method is expected to improve the accuracy of the classification of cultural relics by archaeologists and can effectively reduce the impact of identification difficulties caused by factors such as harsh burial environments. It helps us to have a deeper understanding of the exchange, penetration, and development of ancient Eastern and Western cultures.
In addition, the future research directions of this study can be summarized as follows:
1. Algorithm optimization. The processing method uses a variety of machine learning algorithms that effectively combine the advantages of each, with high practicality, feasibility, and a good fitting effect. However, the current model only incorporates an LSTM deep learning neural network; it could be combined with more advanced deep learning models in the future to improve the accuracy and efficiency of classification.
2. Reducing model calculation cost and formula complexity. Although the classification accuracy of the JMLA model is very high, its calculation time is relatively long compared with other models and its formulas are relatively complex, which is a pain point of the JMLA model. Therefore, reducing the calculation load and better integrating the three models will be a focus of future research.
3. Application prospects of this data processing method. Beyond its application to the composition analysis and identification of cultural heritage, it is expected to be applied in the areas of health, food safety, and environmental protection, for example: the analysis and classification of the chemical constituents of tobacco, the analysis of nutritional composition in food, and the classification and monitoring of pollutant composition in the air.

Note: HPWP01(1) denotes part 1 of weathering point 01 of high-potassium glass; HPNP01(1) denotes part 1 of non-weathering point 01 of high-potassium glass; LBWP01(1) denotes part 1 of weathering point 01 of lead-barium glass; LBNP01(1) denotes part 1 of non-weathering point 01 of lead-barium glass; and LBSWP01(1) denotes part 1 of severe weathering point 01 of lead-barium glass.