A Method of L1-Norm Principal Component Analysis for Functional Data

Abstract: With the spread of intelligent terminals, research on intelligent big data has received increasing attention. Among these data, those with functional characteristics, known as functional data, have attracted particular interest. Functional principal component analysis (FPCA), an unsupervised machine learning method, plays a vital role in the analysis of functional data. It is the primary step in functional data exploration, so its reliability matters for all subsequent analysis. However, classical L2-norm functional principal component analysis (L2-norm FPCA) is sensitive to outliers. Inspired by L1-norm principal component analysis methods for multivariate data, we propose an L1-norm functional principal component analysis method (L1-norm FPCA). Because the proposed method uses the L1-norm, the L1-norm FPCs are less sensitive to outliers than the L2-norm FPCs, which are the eigenfunctions of the symmetric covariance operator. The algorithm for solving the L1-norm maximization model is obtained by extending a multivariate L1-norm principal component analysis algorithm to functional data. Numerical experiments show that the proposed L1-norm FPCA is more robust than L2-norm FPCA, and that the L1-norm principal components reconstruct the original uncontaminated functional data as well as the L2-norm principal components do.


Introduction
In recent years, with the rapid spread of intelligent terminals and sensors, massive amounts of data have accumulated, and processing technology for intelligent big data has attracted more and more attention. Among these data, those with functional characteristics, such as physiological indicator data, growth curve data, air quality data, and temperature data, have also attracted attention. Such data are discrete samples of a continuous function, so they are known in the literature as functional data [1][2][3][4][5][6][7][8][9][10]. The difference between functional data and traditional multivariate data is that the former treats the observed discrete data as a whole, as a realization of a random process. Therefore, the first step of statistical analysis is to fit the discrete data to smooth curves; this handles missing data and inconsistent sampling intervals, which are difficult issues for multivariate data. Moreover, if the fitted curve is smooth enough, more information can be obtained from its derivatives, which is impossible for traditional multivariate data. As a nonparametric statistical method, functional data analysis is not limited by a model and its parameters, so it can better reflect real laws in nature. At present, statistical analysis methods for functional data are widely used in biology, medicine, economics and meteorology [11][12][13][14][15][16][17][18].
Functional principal component analysis (FPCA), as an unsupervised machine learning method, plays a vital role in the analysis of functional data. The central idea of FPCA is to use a few orthogonal dimensions to express most of the information in the original functional data. Through dimensionality reduction, the analysis of the original functional data is transformed into the analysis of the eigenfunctions of a few dimensions, which greatly reduces the complexity of the functional data and makes them easier to interpret. Since J.O. Ramsay proposed the idea of functional principal component analysis in 1991 [19], many studies of functional principal component analysis have appeared. Classical functional principal components are the eigenfunctions of the symmetric empirical covariance operator [20]. As early as 1982, Pousse and Romain studied the asymptotic properties of the eigenfunctions of the empirical covariance operator, i.e., the empirical functional principal components [21]. In order to avoid violent oscillation of the estimated principal component weight function, Rice and Silverman (1991) proposed a smooth functional principal component estimation method that smoothed the principal component weight function by adding a penalty to the variance after projection [22]. The consistency of this smooth functional principal component estimate was then confirmed by Pezzulli and Silverman (1993) [23]. Silverman (1996) proposed another method of smooth functional principal components. Unlike the method of Rice and Silverman (1991), the new method achieved smoothness of the principal component function by penalizing the norm rather than the projected variance [24]. Gareth (2000) studied principal component analysis for sparse functional data [25]. Boente (2000) studied kernel-based functional principal components [26]. Hall (2006) studied the properties of functional principal components [27]. Benko (2009) studied common functional principal components [28], and Hormann (2015) studied dynamic functional principal components [29].
Functional principal component analysis (FPCA) is an important research subject in machine learning and artificial intelligence, and it is the primary step in functional data exploration. Therefore, the reliability of FPCA plays an important role in subsequent analysis. The aforementioned principal component methods for functional data were established in an L2-norm framework. However, because the L2-norm enlarges the influence of outliers, the traditional functional principal component analysis method is sensitive to outliers. For multivariate data, on the other hand, research on principal component analysis methods [30][31][32][33][34][35][36][37] has shown that L1-norm principal component analysis is more robust than its L2-norm counterpart. In [30], Kwak (2008) proposed an L1-PCA optimization model based on L1-norm maximization for multivariate data, i.e.,

$$W_{L1} = \arg\max_{W \in \mathbb{R}^{D \times K},\ W^{T}W = I} \left\| W^{T}X \right\|_{1}.$$

The algorithm in [30] gives an approximate solver for this problem through a sequence of deflating nullspace projections with cost $O(N^{2}DM)$, and it is robust to outliers and invariant to rotations. In [31], Nie et al. (2011) simultaneously approximated all M L1-PCs of X with complexity $O(N^{2}DM + NM^{3})$; however, the principal components obtained by [31] depend strongly on the chosen subspace dimension M. For example, the projection vector obtained when M = 1 may not lie in the subspace obtained when M = 2. The algorithm in [33] introduced a bit-flipping-based approximate solver for the same problem, with a cost that depends on d = rank(X); its solution shows low performance degradation and is close to L2-PCA, but the price is that it is not as robust as that in [30]. The work in [32] offered an algorithm for the exact calculation of $W_{L1}$ with complexity $O(2^{NM})$; however, when X is big data with large N and/or large dimension D, the cost is prohibitive. The authors of [34] studied the relationship between independent component analysis (ICA) and L1-PCA, and they proved that ICA can be performed by L1-norm PCA under the assumption of whitening. The authors of [36] computed L1-PCA by an incremental algorithm, in which only one measurement was processed at a time and changes in the nominal signal subspace could be tracked. Instead of maximizing the L1-norm deviation of the projected data, the authors of [35,37] focused on minimizing the L1-norm reconstruction error. However, in contrast to conventional L2-PCA, the solutions that minimize the L1-norm reconstruction error might not be the same as the solutions that maximize the L1-norm deviation of the projected data.

Inspired by this research on L1-PCA for multivariate data, in this paper we construct a robust L1-norm principal component analysis method for functional data (L1-norm FPCA). Firstly, we build an L1-norm maximization model for the principal components of functional data; then, the corresponding solving algorithm is obtained by extending the multivariate L1-norm principal component analysis method of [30] to functional data. Numerical experiments show that the L1-norm functional principal component analysis method provides a more robust estimation of principal components than the traditional L2-norm functional principal component analysis method (L2-norm FPCA). Finally, by comparing the reconstruction errors of the L1-norm FPCA and L2-norm FPCA, it is found that the L1-norm functional principal components reconstruct the original uncontaminated functional data as well as the L2-norm functional principal components do.

Problem Description
Without loss of generality, we assume that the functional data $x_1(t), x_2(t), \cdots, x_n(t)$, $t \in \tau \subset \mathbb{R}$, are centralized. The purpose of functional principal component analysis (FPCA) is to express as much of the information in the original functional data as possible with as few dimensions as possible. Firstly, the case of only one principal component is considered. Here, the task of FPCA is to find a "projection direction" in infinite-dimensional space so that the variance of the projections of $x_1(t), x_2(t), \cdots, x_n(t)$ onto that direction is maximal. Assuming that the projection direction is $\xi_1(t)$, which is called the first functional principal component weight function of the functional data, $\xi_1(t)$ should be the solution of the following optimization problem:

$$\max_{\xi_1}\ \frac{1}{n}\sum_{i=1}^{n}\left(\int_{\tau}\xi_1(t)\,x_i(t)\,dt\right)^{2} \quad \text{s.t.} \quad \int_{\tau}\xi_1^{2}(t)\,dt = 1. \qquad (1)$$

If the information expressed by one principal component is insufficient, a second projection direction $\xi_2(t)$ is necessary, which is orthogonal to the first principal component direction $\xi_1(t)$ and maximizes the variance of the projected functional data under this orthogonality condition. This is the second functional principal component weight function. The process continues in the same way until the obtained principal components express enough information. Therefore, the subsequent principal component weight functions need to satisfy the following optimization model:

$$\max_{\xi_j}\ \frac{1}{n}\sum_{i=1}^{n}\left(\int_{\tau}\xi_j(t)\,x_i(t)\,dt\right)^{2} \quad \text{s.t.} \quad \int_{\tau}\xi_j^{2}(t)\,dt = 1,\quad \int_{\tau}\xi_j(t)\,\xi_k(t)\,dt = 0,\ k = 1, 2, \cdots, j-1. \qquad (2)$$

J.O. Ramsay proved that the principal component weight functions $\xi_1(t), \xi_2(t), \cdots, \xi_m(t)$ of the functional data are the eigenfunctions that correspond to the m largest eigenvalues of the sample covariance function $\hat{C}(t, s)$ of the functional data, i.e.,

$$\int_{\tau}\hat{C}(t, s)\,\xi_i(t)\,dt = \rho_i\,\xi_i(s), \quad i = 1, 2, \cdots, m,$$

where $\rho_1 \ge \rho_2 \ge \cdots \ge \rho_m$ are the eigenvalues of the covariance function $\hat{C}(t, s)$.
Formulas (1) and (2) square every projection, so curves that lie far from the bulk of the data dominate the objective; the L2-norm functional principal components therefore enlarge the influence of outliers and are sensitive to them. For this reason, L1-norm functional principal components are constructed in this paper. Compared with the traditional L2-norm, the L1-norm weakens the influence of outliers, so the L1-norm functional principal components can be expected to have a good anti-noise ability.
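The classical L2-norm procedure described in this section can be sketched numerically. This is a minimal illustration, not the authors' code: it assumes that each centralized curve has already been expanded in an orthonormal basis, so that the eigenproblem for the sample covariance function reduces to an eigen-decomposition of the coefficient covariance matrix. The function name `l2_fpca_coefficients`, the divisor n, and the toy coefficient array `C` are assumptions made for illustration.

```python
import numpy as np

def l2_fpca_coefficients(C, m):
    """Top-m eigenvalues and coefficient vectors of the L2-norm FPCs.

    C : (n, K) array; row i holds the coefficients of the centralized
        curve x_i(t) in an orthonormal basis (assumption of this sketch).
    """
    n = C.shape[0]
    cov = C.T @ C / n                        # K x K covariance of the coefficients
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:m]    # indices of the m largest eigenvalues
    return eigvals[order], eigvecs[:, order]

rng = np.random.default_rng(0)
# Toy coefficients: most of the variance lies along the first basis function.
C = rng.normal(size=(200, 5)) * np.array([3.0, 1.5, 0.5, 0.2, 0.1])
C = C - C.mean(axis=0)                       # centralize the curves
rho, Xi = l2_fpca_coefficients(C, m=2)       # Xi[:, j] gives xi_{j+1}(t) = Xi[:, j]^T phi(t)
```

Each column of `Xi` is the coefficient vector of one weight function in the chosen basis; the weight function itself is recovered by multiplying with the basis functions.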

L1-Norm Functional Principal Component Analysis (L1-Norm FPCA)
Suppose $x_1(t), x_2(t), \cdots, x_n(t)$ are realizations of a square integrable stochastic process $x(\cdot)$. Without loss of generality, suppose that $x_1(t), x_2(t), \cdots, x_n(t)$ have been centralized. We want to find an m-dimensional linear subspace so that the L1-norm dispersion of the projections of $x_1(t), x_2(t), \cdots, x_n(t)$ onto this subspace is the largest. Assume that the subspace is spanned by $\beta_1(t), \beta_2(t), \cdots, \beta_m(t)$; the optimization problem corresponding to Formulas (1) and (2) is then:

$$\max_{\beta_1, \cdots, \beta_m}\ \sum_{j=1}^{m}\sum_{i=1}^{n}\left|\int_{\tau}\beta_j(t)\,x_i(t)\,dt\right| \quad \text{s.t.} \quad \int_{\tau}\beta_j^{2}(t)\,dt = 1,\quad \int_{\tau}\beta_j(t)\,\beta_k(t)\,dt = 0,\ j, k = 1, 2, \cdots, m;\ j \ne k. \qquad (3)$$

Optimization Problem (3) is not easy to solve, because the objective function contains an absolute value operation and is therefore non-differentiable, and the problem is non-convex. Next, we try to find the solution of Optimization Problem (3) from the perspective of orthogonal basis expansion.
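Let $\phi(t) = (\phi_1(t), \phi_2(t), \cdots, \phi_K(t))^{T}$ be an orthonormal basis, and write $x_i(t) = c_i^{T}\phi(t)$ and $\beta_j(t) = b_j^{T}\phi(t)$ with coefficient vectors $c_i, b_j \in \mathbb{R}^{K}$. Under the above assumptions, we get $\int_{\tau}\beta_j(t)\,x_i(t)\,dt = b_j^{T}c_i$; the constraints $\int_{\tau}\beta_j^{2}(t)\,dt = 1$, $j = 1, 2, \cdots, m$, can be expressed as $b_j^{T}b_j = 1$, $j = 1, 2, \cdots, m$, and the constraints $\int_{\tau}\beta_j(t)\,\beta_k(t)\,dt = 0$, $j, k = 1, 2, \cdots, m$; $j \ne k$, can be expressed as $b_j^{T}b_k = 0$, $j, k = 1, 2, \cdots, m$; $j \ne k$. Therefore, Optimization Problem (3) can be transformed into the following Optimization Problem (4):

$$\max_{b_1, \cdots, b_m}\ \sum_{j=1}^{m}\sum_{i=1}^{n}\left|b_j^{T}c_i\right| \quad \text{s.t.} \quad b_j^{T}b_j = 1,\quad b_j^{T}b_k = 0,\ j, k = 1, 2, \cdots, m;\ j \ne k. \qquad (4)$$

If we can get the solution of Optimization Problem (4), then, according to $\beta_j(t) = b_j^{T}\phi(t)$, $j = 1, 2, \cdots, m$, we can get the solution of Optimization Problem (3). There are several algorithms for solving Optimization Problem (4), such as those in [30,31,33], each of which has its own advantages. Since our goal is to build robust principal components for functional data, we choose the algorithm in [30], because the principal components it calculates are more robust to outliers and because its complexity is relatively low when the number of data, the data dimension, and the number of principal components are large.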
Next, based on the orthogonal basis expansion of functional data, we employ the L1-norm PCA algorithm of multivariate data [30] to get the solving algorithm of the L1-norm functional principal component weight functions (Abbreviation: L1-FPCA algorithm). The algorithm is rewritten in the next section.
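The reduction from Optimization Problem (3) to Optimization Problem (4) rests on one identity, worked out here as a short sketch. It assumes an orthonormal basis $\phi(t)$, i.e., $\int_\tau \phi(t)\phi(t)^{T}dt = I_K$, and writes $c_i$ for the coefficient vector of $x_i(t)$; both are stated here as assumptions of the sketch.

```latex
\begin{aligned}
\int_\tau \beta_j(t)\,x_i(t)\,dt
  &= \int_\tau b_j^{T}\phi(t)\,\phi(t)^{T}c_i\,dt
   = b_j^{T}\Big(\int_\tau \phi(t)\,\phi(t)^{T}\,dt\Big)c_i
   = b_j^{T} I_K\, c_i
   = b_j^{T}c_i, \\
\int_\tau \beta_j(t)\,\beta_k(t)\,dt
  &= b_j^{T}\Big(\int_\tau \phi(t)\,\phi(t)^{T}\,dt\Big)b_k
   = b_j^{T}b_k .
\end{aligned}
```

With a basis that is not orthonormal, the identity matrix would be replaced by the Gram matrix of the basis, and the constraints would no longer reduce to plain Euclidean orthonormality of the $b_j$.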

Only One Principal Component
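First, we discuss the case where there is only one principal component, namely m = 1. In this case, Optimization Problems (3) and (4) simplify, respectively, to

$$\max_{\beta}\ \sum_{i=1}^{n}\left|\int_{\tau}\beta(t)\,x_i(t)\,dt\right| \quad \text{s.t.} \quad \int_{\tau}\beta^{2}(t)\,dt = 1, \qquad (5)$$

$$\max_{b}\ \sum_{i=1}^{n}\left|b^{T}c_i\right| \quad \text{s.t.} \quad b^{T}b = 1, \qquad (6)$$

where $c_i$ is the basis coefficient vector of $x_i(t)$. Next, we construct the L1-FPCA algorithm to solve Optimization Problems (5) and (6), following the steps of the algorithm in [30].

L1-FPCA Algorithm:

Step 1: Arbitrarily choose the initial projection direction $\beta_0(t) = b_0^{T}\phi(t)$, normalize $b_0 \leftarrow b_0 / \|b_0\|_2$, and set the iteration number k to be 0.

Step 2: For all $i \in \{1, 2, \cdots, n\}$, set $p_i(k) = -1$ if $b_k^{T}c_i < 0$, and $p_i(k) = 1$ otherwise.

Step 3: Let $k \leftarrow k + 1$ and $b_k = \sum_{i=1}^{n} p_i(k-1)\,c_i \,/\, \big\|\sum_{i=1}^{n} p_i(k-1)\,c_i\big\|_2$, and get the corresponding $\beta_k(t) = b_k^{T}\phi(t)$.

Step 4: If $b_k \ne b_{k-1}$, return to Step 2. Otherwise, if there exists an i such that $b_k^{T}c_i = 0$, let $b_k \leftarrow (b_k + \Delta b) / \|b_k + \Delta b\|_2$ and get the corresponding $\beta_k(t)$, then return to Step 2, where $\Delta b$ is a small nonzero random vector. Otherwise, set $b^{*} = b_k$ and $\beta^{*}(t) = b^{*T}\phi(t)$, and stop.

Theorem 1. The L1-FPCA algorithm is convergent, its convergence point $b^{*}$ is a local maximum point of Optimization Problem (6), and $\beta^{*}(t)$ is a local maximum point of Optimization Problem (5).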
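Proof. First, we prove that the objective function $\sum_{i=1}^{n}|b_k^{T}c_i|$ is non-decreasing in k. Let $p_i(k) = -1$ if $b_k^{T}c_i < 0$ and $p_i(k) = 1$ otherwise. Then

$$\sum_{i=1}^{n}\left|b_{k+1}^{T}c_i\right| \ \ge\ \sum_{i=1}^{n} p_i(k)\,b_{k+1}^{T}c_i \ =\ b_{k+1}^{T}\Big(\sum_{i=1}^{n} p_i(k)\,c_i\Big) \ \ge\ b_{k}^{T}\Big(\sum_{i=1}^{n} p_i(k)\,c_i\Big) \ =\ \sum_{i=1}^{n}\left|b_{k}^{T}c_i\right|,$$

where the second inequality holds because $b_{k+1}$ is the unit vector parallel to $\sum_{i=1}^{n} p_i(k)\,c_i$. Therefore, the objective function is non-decreasing, and because there are only a finite number of data points, the convergence points $\beta^{*}(t)$ and $b^{*}$ of the L1-FPCA algorithm exist. Next, we prove that $b^{*}$ and $\beta^{*}(t)$ are local maxima of the corresponding optimization problems.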
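At convergence, $b^{*T}c_i \ne 0$ for every i by Step 4, so there is a neighbourhood of $b^{*}$ in which the polarities $p_i$ do not change. In that neighbourhood the objective function equals $b^{T}\sum_{i=1}^{n}p_i c_i$, which is maximized over unit vectors by $b^{*}$, because $b^{*}$ is parallel to $\sum_{i=1}^{n}p_i c_i$. Therefore, the L1-FPCA procedure finds a local maximum point $b^{*}$ of Optimization Problem (6), and $\beta^{*}(t) = b^{*T}\phi(t)$ is a local maximum point of Optimization Problem (5).

Since the L1-FPCA algorithm only obtains a locally optimal solution, we expect to find the globally optimal solution with high probability by setting the initial projection direction $\beta_0(t)$ appropriately, e.g., by setting $\beta_0(t) = \arg\max_{x_i}\int_{\tau}x_i^{2}(t)\,dt$ or by setting it to be the solution of L2-FPCA. In practice, we usually select several different initial projection directions $\beta_0(t)$, calculate the respective locally optimal solutions, and keep the solution that maximizes the objective function.

For m > 1, the weight functions are extracted one at a time in a greedy manner as in [30]: starting from $c_i^{0} = c_i$ and $b_0 = 0$, the direction already found is removed from the coefficient vectors before the next one is sought.

Step 3: For all $i \in \{1, 2, \cdots, n\}$, let $c_i^{j} = c_i^{j-1} - b_{j-1}\big(b_{j-1}^{T}c_i^{j-1}\big)$, and apply the L1-FPCA algorithm to $c^{j} = (c_1^{j}, c_2^{j}, \cdots, c_n^{j})$ to obtain the projection vector $b_j$ and the corresponding $\beta_j(t)$.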
Step 4: Repeat Step 3 until m projection vectors $b_1, b_2, \cdots, b_m$ and the corresponding $\beta_1(t), \beta_2(t), \cdots, \beta_m(t)$ are obtained.

Since $b_1, b_2, \cdots, b_m$ are orthonormal vectors in $\mathbb{R}^{K}$ [38], the principal component weight functions $\beta_1(t), \beta_2(t), \cdots, \beta_m(t)$, $t \in \tau$, are also orthonormal, because

$$\int_{\tau}\beta_j(t)\,\beta_k(t)\,dt = b_j^{T}\Big(\int_{\tau}\phi(t)\,\phi(t)^{T}\,dt\Big)b_k = b_j^{T}b_k = \begin{cases} 1, & j = k, \\ 0, & j \ne k. \end{cases}$$

As with L2-norm functional principal component analysis, it is necessary to consider how many principal components are appropriate. This is determined by the cumulative variance contribution rate. That is, with $v_j$ the variance of the projections onto the j-th projection direction and S the total variance, m is chosen so that the cumulative contribution rate $\sum_{j=1}^{m}v_j / S$ is more than 80% or 85%.
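The procedure of this section can be sketched in code. This is a hedged illustration in the spirit of the PCA-L1 iteration of [30], applied to basis coefficient vectors, and not the authors' implementation. The function names `l1_direction` and `l1_fpca`, the tolerance-based convergence check, the size of the random perturbation, and the toy array `C` are all assumptions of the sketch; an orthonormal basis is assumed so that coefficient vectors can be treated with Euclidean geometry.

```python
import numpy as np

def l1_direction(C, b0, max_iter=1000, seed=0):
    """One L1-norm projection vector for the rows of C (fixed-point sketch)."""
    rng = np.random.default_rng(seed)
    active = np.linalg.norm(C, axis=1) > 1e-12       # ignore all-zero rows
    b = b0 / np.linalg.norm(b0)
    for _ in range(max_iter):
        p = np.where(C @ b < 0, -1.0, 1.0)           # polarity of each projection
        s = C.T @ p                                  # sum_i p_i c_i
        norm_s = np.linalg.norm(s)
        if norm_s == 0.0:                            # degenerate data, nothing to do
            return b
        b_new = s / norm_s
        if np.allclose(b_new, b):
            if np.any(np.isclose(C[active] @ b_new, 0.0)):
                # a curve has zero projection: perturb slightly and iterate again
                b = b_new + 1e-6 * rng.normal(size=b.shape)
                b /= np.linalg.norm(b)
                continue
            return b_new
        b = b_new
    return b

def l1_fpca(C, m):
    """Greedy extraction of m L1-norm projection vectors by deflation."""
    C_work = np.array(C, dtype=float)
    B = []
    for _ in range(m):
        # initial direction: the curve with the largest norm
        b0 = C_work[np.argmax(np.linalg.norm(C_work, axis=1))]
        b = l1_direction(C_work, b0)
        B.append(b)
        C_work = C_work - np.outer(C_work @ b, b)    # remove the found direction
    return np.column_stack(B)

rng = np.random.default_rng(1)
C = rng.normal(size=(100, 6)) * np.array([3.0, 2.0, 1.0, 0.5, 0.3, 0.1])
C -= C.mean(axis=0)                                  # centralize the toy coefficients
B = l1_fpca(C, m=3)                                  # columns are b_1, b_2, b_3
```

Because every later direction is a normalized combination of deflated coefficient vectors, the columns of `B` come out orthonormal up to rounding error. Running the sketch from several initial directions and keeping the result with the largest objective value follows the practice described in the text.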

Simulation
In order to compare the robustness to outliers of the L1-norm functional principal components (L1-FPCs) proposed in this paper and the classical L2-norm functional principal components (L2-FPCs), we performed a simulation study, following the simulation setting given by Fraiman and Muniz (2001) [38]. Here, we considered functional data $x_1(t), x_2(t), \cdots, x_n(t)$ that are realizations of a square integrable stochastic process $X(\cdot)$, with the function curves generated from different models. There was no contamination in Model 1, and the other models were obtained from Model 1 by adding different types of contamination.
Model 3 (symmetric contamination): y i (t) = x i (t) + c i σ i M, i = 1, 2, · · · , n, where c i and M are defined as in Model 2 and σ i is a sequence of random variables with values of 1 and −1 with a probability of 1/2 that is independent of c i .
Model 4 (partial contamination): $y_i(t) = x_i(t) + c_i\sigma_i M$ for $t \ge T_i$ and $y_i(t) = x_i(t)$ for $t < T_i$, $i = 1, 2, \cdots, n$, where $T_i$ is a random number generated from a uniform distribution on [0, 1].

Model 5 (peak contamination): $y_i(t) = x_i(t) + c_i\sigma_i M$ for $T_i \le t \le T_i + l$ and $y_i(t) = x_i(t)$ otherwise, $i = 1, 2, \cdots, n$, where $l = 1/15$ and $T_i$ is a random number generated from a uniform distribution on $[0, 1 - l]$.

Figure 1 shows the simulated curves of these five models. For each model, we set 100 equally spaced sampling points in [0, 1] and generated 200 replications. For Model 1, the parameter q was 0 and the contamination constant M was 0. For the contaminated models, we considered several levels of contamination, with q = 5% and 10% and contamination constants M = 5 and 10. When fitting the function curves, we used generalized cross validation (GCV) to obtain the number of basis functions. The results showed that the numbers of basis functions for Models 1-3 were the same, while those for Models 4 and 5 were different. However, the change in the principal component coefficients can only be calculated on a common basis. Therefore, for comparison purposes, in Models 4 and 5 we selected the same number of basis functions as in Model 1. Classical L2-norm FPCA and L1-norm FPCA were applied to the simulated functional data corresponding to these five models. We focused on their robustness to various abnormal disturbances.
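The contamination mechanisms can be sketched as a function that takes uncontaminated curves and returns contaminated ones. This is an illustration under stated assumptions rather than the simulation code of the paper: the definition of Model 2 is not reproduced in this section, so the sketch follows the setting of [38] and assumes that $c_i = 1$ with probability q and 0 otherwise, with an asymmetric shift $c_i M$. The function name `contaminate` and the placeholder curves `X` are hypothetical, and `X` is not the Model 1 process.

```python
import numpy as np

def contaminate(X, t, model, q, M, l=1/15, seed=0):
    """Apply contamination Models 2-5 to the rows of X (curves sampled at t)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    c = (rng.random(n) < q).astype(float)          # assumed: c_i = 1 with probability q
    sigma = rng.choice([-1.0, 1.0], size=n)        # sigma_i = +1 or -1 with probability 1/2
    if model == 2:                                 # asymmetric contamination (assumed form)
        shift = np.outer(c * M, np.ones_like(t))
    elif model == 3:                               # symmetric contamination
        shift = np.outer(c * sigma * M, np.ones_like(t))
    elif model == 4:                               # partial: shifted from T_i onwards
        T = rng.uniform(0.0, 1.0, size=n)
        shift = (c * sigma * M)[:, None] * (t[None, :] >= T[:, None])
    elif model == 5:                               # peak: shifted on [T_i, T_i + l]
        T = rng.uniform(0.0, 1.0 - l, size=n)
        inside = (t[None, :] >= T[:, None]) & (t[None, :] <= T[:, None] + l)
        shift = (c * sigma * M)[:, None] * inside
    else:
        raise ValueError("model must be 2, 3, 4 or 5")
    return X + shift

t = np.linspace(0.0, 1.0, 100)                     # 100 equally spaced sampling points
rng = np.random.default_rng(2)
X = 0.1 * rng.normal(size=(200, 100)).cumsum(axis=1)   # placeholder curves only
Y = contaminate(X, t, model=4, q=0.10, M=5.0)
```

Keeping the uncontaminated `X` alongside `Y` is what allows the later comparison of weight-function coefficients and of reconstruction errors against the original curves.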
When implementing L1-norm FPCA on Model 1, the initial value was chosen, by comparing the values of the objective function, as the first L2-norm functional principal component weight function, i.e., $\beta_0(t) = \xi(t)$, where $\xi(t)$ is the eigenfunction corresponding to the largest eigenvalue of the sample covariance function of the functional data in Model 1. Because the L1-norm FPCA results of the contaminated models were to be compared with those of Model 1, and in order to keep the conditions consistent, the initial values for the contaminated models were likewise set to the eigenfunction corresponding to the largest eigenvalue of the sample covariance function of the corresponding functional data.

It can be seen from Tables 1-4 that, under the same contamination ratio and contamination size, the coefficient changes of the L1-norm principal component weight functions were significantly smaller than those of the L2-norm, which shows that the L1-norm functional principal components were more stable than the L2-norm ones, whatever the form of contamination. This conclusion is also confirmed by the boxplots of the coefficient changes of the principal component weight functions.
As can be seen from Figures 2-5, in the same contamination ratio and size, the changes of L1-norm principal component coefficient are more concentrated near zero compared with the changes of the L2-norm principal component coefficient, which shows that under the same contamination mode, L1-norm functional principal components were more robust to outliers and more reliable.
From the above research, we found that the L1-norm functional principal components were more robust than the L2-norm functional principal components. How well, then, can the original functional data be reconstructed with these two types of principal components? To study this problem, we reconstructed the original uncontaminated functional data with the same number of L1-norm and L2-norm functional principal components under each model. The scatter plots of the coefficients of the two types of reconstruction error curves are shown in Figures 6-9.

In Figures 6-9, we can see that the scatter plots of the reconstruction error curve coefficients of the L1-norm and L2-norm were always near the line y = x under the first three contamination models, and under peak contamination the reconstruction error of the L1-norm was smaller than that of the L2-norm. When using the paired one-sided t-test, the p-values were all found to be close to 1, indicating that the reconstruction error curve coefficients of the L1-norm were not greater than those of the L2-norm. Thus, the reconstruction ability of the L1-norm principal components for the original uncontaminated functional data was not worse than that of the L2-norm principal components. The results of the paired one-sided t-test are shown in Tables 5-8. The above experiments showed that the L1-norm functional principal components were not only stable and reliable but also had the same reconstruction ability as the L2-norm functional principal components.

Canadian Weather Data
We used Canadian weather data, which provide daily temperatures at 35 different locations in Canada averaged over 1960-1994, in order to compare the robustness to outliers of the L1-norm functional principal components and L2-norm functional principal components when the functional data are contaminated by abnormal data. Firstly, considering the periodic characteristics of the data, the discrete temperature observations were fitted to 35 functional curves with a Fourier basis, and the number of basis functions was 65. The fitted curves are shown in Figure 10a. Using the functional data outlier detection method [39], the temperature patterns of four stations, Vancouver, Victoria, Pr. Rupert and Resolute, were found to differ from those of the other stations. Figure 10b shows the curves after removing the data from these four stations. The functional data of the 35 stations are called the whole data, and the functional data after removing Vancouver, Victoria, Pr. Rupert and Resolute are called the normal data, so the whole data can be understood as the normal data with abnormalities added.

In order to compare the robustness to outliers of the L2-norm and L1-norm functional principal component weight functions, the L2-norm functional principal components and L1-norm functional principal components were each computed for the normal data and for the data with outliers added. For each method, the results of the two cases were compared. Because the variance contribution rate of the first two principal components reached 90%, the following analysis only focused on the first two functional principal components.

Figure 11 shows the change of the first principal component weight function before and after adding outliers for the two functional principal component analysis methods. Figure 11a is a graph of the first principal component weight function obtained with the L2-norm functional principal component method. The solid line is the result for the normal data, and the dashed line is the result after adding the four abnormal curves. Figure 11b is a graph of the first principal component weight function obtained with the proposed L1-norm functional principal component method. After comparing the values of the objective function, the initial value was chosen as the first L2-norm functional principal component weight function, i.e., $\beta_0(t) = \xi(t)$, where $\xi(t)$ is the eigenfunction corresponding to the largest eigenvalue of the sample covariance function of the normal functional data, and likewise for the whole functional data. The solid line is the result for the normal data, and the dashed line is the result after adding the four abnormal curves. Comparing the coefficients of the first functional principal component weight functions, the sum of the absolute changes of the coefficients before and after adding the abnormal curves was 0.16 for the L1-norm method, which was less than the 0.18 of the L2-norm method.

Next, the performance of the second principal component weight function is discussed. Figure 12 shows the change of the second principal component weight function before and after adding outliers for the two functional principal component analysis methods. Figure 12a is a graph of the second principal component weight function obtained with the L2-norm functional principal component method. The solid line is the result for the normal data, and the dashed line is the result after adding the four abnormal curves. Figure 12b is a graph of the second principal component weight function obtained with the proposed L1-norm functional principal component method, with the same line styles. Comparing the coefficients of the second functional principal component weight functions, the sum of the absolute changes of the coefficients before and after adding the abnormal curves was 0.33 for the L1-norm method, which was less than the 0.76 of the L2-norm method. The sums of the absolute values of the coefficient changes of the principal component weight functions under the two methods are shown in Table 9.

Table 9 shows that the classical L2-norm principal component weight functions changed greatly before and after removing outliers, reflecting their sensitivity to outliers. However, the L1-norm functional principal component weight functions presented in this paper changed little before and after adding abnormal values. Therefore, the L1-norm principal component weight functions proposed in this paper have a strong anti-noise ability and good stability. We also compared the ability of the two types of principal components to reconstruct the normal data. The scatter plots of the coefficients of the two types of reconstruction error curves are shown in Figure 13.
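The stability measure reported for the weight functions, the sum of the absolute changes of their basis coefficients, can be sketched as follows. The function name `coefficient_change` is hypothetical. The sign alignment is an assumption of this sketch and is not described in the text: a weight function is only determined up to sign, so the contaminated-data vector is flipped when it points away from the clean-data vector.

```python
import numpy as np

def coefficient_change(b_clean, b_contaminated):
    """Sum of absolute coefficient changes between two weight functions."""
    b_clean = np.asarray(b_clean, dtype=float)
    b_cont = np.asarray(b_contaminated, dtype=float)
    if b_clean @ b_cont < 0:          # assumed: align signs before comparing
        b_cont = -b_cont
    return float(np.abs(b_cont - b_clean).sum())

# Toy usage with two-dimensional coefficient vectors.
same_up_to_sign = coefficient_change([1.0, 0.0], [-1.0, 0.0])
rotated = coefficient_change([1.0, 0.0], [0.6, 0.8])
```

Both vectors must be expressed in the same basis with the same number of basis functions for this quantity to be meaningful.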
From Figure 13, it can be seen that the points in the scatter plot of the reconstruction error curve coefficients of the L1-norm and L2-norm methods were always near the line y = x. When we performed a paired one-sided t-test for the two groups of reconstruction error curve coefficients, the t value was found to be 1.0323, the degrees of freedom of the t-statistic were 33, and the p-value was 0.1547, so the hypothesis that the reconstruction error curve coefficients of the L1-norm method were not greater than those of the L2-norm method could not be rejected. Thus, the ability of the L1-norm principal components to reconstruct the original uncontaminated functional data was not worse than that of the L2-norm principal components.
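The paired one-sided t-test reported above can be sketched with the Python standard library alone. This is a minimal sketch under stated assumptions: the alternative hypothesis is taken to be that the first sample is larger on average than the second, the upper-tail probability of the t distribution is obtained by Simpson's rule, and the function names `t_pdf`, `t_sf` and `paired_one_sided_t` are hypothetical. In practice, a library routine such as `scipy.stats.ttest_rel` with `alternative="greater"` would normally be used instead.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_sf(t, df, steps=2000):
    """Upper-tail probability P(T > t), by Simpson's rule on [0, |t|]."""
    a = abs(t)
    h = a / steps  # `steps` must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3  # the density is symmetric about zero
    return upper if t >= 0 else 1 - upper


def paired_one_sided_t(x, y):
    """Paired t-test of H1: mean(x - y) > 0. Returns (t, df, p)."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    t = mean(d) / (stdev(d) / math.sqrt(n))
    return t, n - 1, t_sf(t, n - 1)


# Tail probability for the statistic and degrees of freedom quoted in the text.
p_reported = t_sf(1.0323, 33)
```

Evaluating `t_sf(1.0323, 33)` gives a value that can be checked against the p-value quoted in the text, while `paired_one_sided_t` would be applied to the two lists of reconstruction error curve coefficients, which are not reproduced here.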

Concluding Remarks
FPCA is a primary step in functional data exploration, and the reliability of FPCA plays an important role in subsequent analysis. The existing principal component methods for functional data were established in an L2-norm framework. However, because the L2-norm enlarges the influence of outliers, the traditional functional principal component analysis method is sensitive to outliers. On the other hand, for multivariate data, the relevant research on principal component analysis [30][31][32][33][34][35][36][37] has shown that the L1-norm principal component analysis method is more robust than the L2-norm method. Motivated by this research, in this paper, we constructed an L1-norm principal component analysis method for functional data. First, we built an L1-norm maximized principal component optimization model for functional data. Then, a corresponding algorithm for solving the L1-norm maximized optimization model was constructed based on the idea of the multivariate data L1-norm principal component analysis method [30]. An extensive simulation study was conducted, and a real dataset of Canadian weather was employed to assess the robustness of the L1-norm functional principal component analysis. From the simulation study, which considered different contamination configurations (symmetric, asymmetric, partial and peak), we found that the L1-norm functional principal component analysis method provides a more robust estimation of the principal components than the traditional L2-norm principal component analysis method. Finally, by comparing the reconstruction errors of the L1-norm FPCA and the L2-norm FPCA, it was found that the ability of the L1-norm principal components to reconstruct the original uncontaminated functional data is as good as that of the L2-norm principal components.
Therefore, when functional data contain outliers, the estimation given by the L1-norm functional principal component analysis method is more reliable. The proposed L1-norm FPCA may prove to be a useful addition to functional data analysis.