Detecting Invalid Associations between Fare Machines and Metro Stations Using Smart Card Data



Introduction
Smart card data collected from the automatic fare collection (AFC) system (i.e., AFC data) enable many beneficial applications in public transportation, such as collective and individual mobility analysis, system state monitoring, and operation planning and control [1]. The usefulness of these applications depends heavily on data quality. Because AFC data are collected online and at a large scale, they inevitably encounter data quality issues such as missing and invalid data.
Data problems are prone to happen for the following reasons: (i) Human factors: in the AFC system, transaction records may be missing if passengers fail to tap in/out properly. (ii) Infrastructure failure: AFC records are triggered when a passenger taps in or out through an entrance/exit fare machine, so malfunctioning fare machines may lead to missing data (the machine fails to record or upload transactions) and invalid data (erroneous transactions). (iii) Inadequate data management: daily data management for transportation systems is a complex practice, and missing and invalid data may arise during database merging, maintenance, or system updates.
Among those data problems, missing and invalid data are the most critical and common ones. Figure 1 illustrates the characteristics of these two problems and how they differ. Missing data are clearly identifiable from the data structure. For example, some AFC transactions may have missing tap-out records (empty cells). Invalid data, however, cannot be recognised directly, since their data structure is exactly the same as that of valid data. Generally, the invalid data problem can be divided into two categories: data record errors and data association errors. Data record errors originate from facility malfunctions in the AFC system, as mentioned above. Data association errors occur when different data sources are merged (e.g., AFC fare machine records and station dictionaries), and may come from incomplete information inference or invalid information matching.
This paper deals with the invalid data problem, detecting hidden association errors in complete and seemingly valid data. Specifically, it aims to detect invalid associations between fare machines and stations in the AFC data. For example, fare machine 001# is located in Metro Station A but wrongly associated with Station B in the AFC database (Figure 2). The problem is prone to happen because fare machines are frequently added, replaced, etc. in metro systems, while the fare machine-station dictionary may not be updated in time. The consequences of invalid associations can be significant, e.g., under- or overcharging a large number of passengers. In addition, it is costly to fix this problem manually: one has to check all the machines in the metro stations to rebuild the correct association between fare machines and stations. In particular, such problems cannot be detected manually in historical datasets, since the past distribution of fare machines may not be consistent with the current system.
We develop a data-driven approach, based on tensor decomposition and machine learning techniques, to automatically detect such invalid associations using AFC data and to infer the correct station that a fare machine belongs to. The approach works in two steps. First, tensor decomposition is used to extract the flow volume and travel time patterns of each fare machine. Then, the isolation forest technique and neural network (NN) models are used to detect the incorrectly linked fare machines and infer their correct association stations from the features extracted by the tensor decomposition. The remainder of the paper is organised as follows: Section 2 reviews the relevant studies on data quality issues, including an overview of data quality problems, feature extraction techniques, and anomaly detection; the problem formulation and methodology are presented in Section 3; Section 4 reports the case study using the AFC data from a large metro system; the final section concludes the paper and discusses potential further studies and applications.

Data Quality Problems.
Data quality is one of the most important issues in the big data area. Low data quality is costly. For example, it is reported that bad data or poor data quality costs US businesses 600 billion dollars annually [2]. For metro systems, AFC systems collect massive transaction data of metro passengers. The literature has reported plenty of data quality problems related to AFC data. Robinson et al. [3] reported that the causes of AFC data quality problems can be grouped into four categories: (1) software; (2) data; (3) hardware; (4) user. A recurrent problem of missing boarding station information in Beijing Metro has been reported by Ma et al. [4]. Liu et al. [5] reported a time synchronisation problem between the AFC and AVL systems, which causes the recorded boarding time information to be invalid on a large scale. Network, scheduling, fare table, and similar data are stored in the AFC database, and errors in these data can have significant consequences. For example, the London Oyster smart card system crashed on Saturday 12th July 2008 due to erroneous data, resulting in over 40,000 Oyster cards having to be replaced [6].
Although many studies deal with missing data in transportation, to the best of our knowledge, there is no study on detecting or fixing the association errors in transportation or other related areas, particularly the fare machine-station invalid association problem.

Feature Extraction Techniques.
The key idea for a data-driven detection approach is to extract the passenger flow and/or travel time patterns between fare machines and stations.
[Figure: AFC record tables with columns User ID, Origin, Tap-in time, Destination, Tap-out time]
Feature extraction is one of the most important issues in the machine learning field. It reduces the resources required to characterize a large set of data and/or high-dimensional input information. Plenty of feature extraction methods have been proposed in the machine learning community. These methods can be roughly divided into two groups: conventional statistical learning methods and deep learning-based methods. Conventional methods such as principal component analysis (PCA) [7], Isomap [8], and partial least squares (PLS) regression [9] are mainly based on statistical learning algorithms. Their advantage is that they are robust to small datasets, i.e., they do not need a large number of samples to maintain their performance. However, their disadvantages are also critical. For example, they are not robust to noisy samples, and the feature extraction quality is highly dependent on task-specific tricks, which makes them less general. Deep learning-based feature extraction methods have recently become more and more popular. Various forms of neural networks, e.g., the convolutional neural network (CNN) [10] and the long short-term memory (LSTM) [11] neural network, can be treated as feature extraction models. Unlike the statistical learning-based algorithms, they extract the features in a latent, end-to-end manner. The advantage is that the extracted features are more representative and comprehensive. However, these models always require a large dataset for training; thus, they are not suitable for few-shot scenarios. In conclusion, there is no general feature extraction method for all tasks. Feature extraction methods should be designed based on the characteristics of the problem at hand.
In our problem, passenger flow and travel time patterns are related to multiple modes, e.g., time and location. A tensor, which is a multidimensional extension of a matrix [12], is a natural choice to represent and capture these patterns. Tensors have been widely used in the transportation area to deal with multidimensional data. Tan et al. [13] used a tensor decomposition approach to capture the multimode correlations in traffic data and recover missing traffic data by reconstructing the traffic flow tensor. Their results show that the algorithm performs well even when the missing ratio is high. Chen et al. [14] proposed a singular value decomposition (SVD)-combined tensor decomposition framework to complete traffic data using traffic speed information. Sun and Axhausen [15] used a probabilistic tensor decomposition method to mine urban mobility patterns, exploring the mobility patterns of different passenger groups (e.g., students, adults, and elders). In our study, we also use tensor decomposition to extract the flow pattern related to each fare machine.

Anomaly Detection.
The invalid associations (between fare machines and stations) are treated as anomalies. Anomaly detection is an important topic in data mining. Anomaly detection methods can be roughly divided into three categories: statistical, machine learning, and deep learning models.
(1) Statistical methods: statistical methods are the earliest explorations of anomaly detection. The methods in this category first make assumptions about the distribution of the studied dataset. The samples with low probabilities are treated as anomalies. Rousseeuw and Driessen [16] proposed an anomaly detection method based on a Gaussian assumption on the data. The performance of statistical anomaly detection methods depends highly on how well the assumption fits reality, which limits their performance.
(2) Machine learning-based methods: the most widely used anomaly detection methods are machine learning-based, and they generally fall into two categories: supervised and unsupervised methods. Supervised methods [17,18] require a dataset labeled with "nominal" or "anomaly." The models are trained with the labeled data and then used to identify new instances. Unsupervised methods deal with datasets without labels and automatically detect the anomalies based on certain criteria. Popular unsupervised methods include LOF [19], DBSCAN [20], k-means [21], and the isolation forest [22] method.
(3) Deep learning-based methods: the emerging deep learning models bring new opportunities to better solve the anomaly detection problem. Hundman et al. [23] proposed an LSTM network-based framework for anomaly detection; the authors of [24] used a generative adversarial network (GAN) to detect anomalies in time series data. Nguyen et al. [25] detect anomalies by constructing model snapshots and outputting the ensembles of the NN models. Deep learning-based methods tend to perform better than other techniques. However, they require a large amount of training data to produce reasonable results, and their performance is low in scenarios with a small training set, e.g., the fare machine-station association problem studied in this paper.

Problem Formulation.
Let $m$ be a fare machine, and let $S_m, \hat{S}_m \in \Delta$ be its actual station and its current association station in the AFC dataset, respectively, where $\Delta = \{S_1, S_2, \ldots, S_s\}$ contains all the stations in the metro system. Note that different fare machines can share the same station, i.e., be located in the same station. If $S_m = \hat{S}_m$, fare machine $m$ is defined as a valid association fare machine; if $S_m \neq \hat{S}_m$, fare machine $m$ is defined as an invalid association fare machine. The fare machine-station association detection problem is defined as follows.
Given an AFC dataset D and a set of fare machines Φ recorded in D, detect invalid association fare machines and infer their associated stations for fare machines m in Φ.
Mathematically, the problem is defined as follows:
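As a hedged sketch of one way to state the two subproblems formally, write $S_m$ for the actual station of machine $m$ and $\hat{S}_m$ for the station recorded in $D$. The hat notation and the set name $\Phi_{\mathrm{inv}}$ are assumptions of this sketch, not notation confirmed by the text:

```latex
\begin{aligned}
\text{(detection)}\quad & \Phi_{\mathrm{inv}} = \{\, m \in \Phi : S_m \neq \hat{S}_m \,\},\\
\text{(inference)}\quad & \text{for each } m \in \Phi_{\mathrm{inv}},\ \text{estimate } S_m \in \Delta \text{ from } D .
\end{aligned}
```

Since $S_m$ is not observed, both steps have to rely on patterns in $D$ rather than on a direct comparison of $S_m$ and $\hat{S}_m$.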

Fare Machine Features.
For convenience, we define the concept of fare machine-related passenger flow (MRF). For an entrance fare machine, MRF refers to the passenger flow tapping in an entrance fare machine of the origin station and tapping out at a destination station (using any machine) during a certain time slot. For an exit fare machine, MRF represents the passenger flow tapping in at an origin station (using any machine) and tapping out at an exit fare machine during a certain time slot. MRF can be characterized using different features, such as flow volume and travel time.
Indicators extracted from the MRF features can be used to characterize fare machines. The hypothesis is that the MRF features of fare machines located at the same station are more similar to each other than those of fare machines at different stations. The flow volume and travel time are selected to characterize the MRFs of fare machines. These two features reflect system dynamics from both the demand (mobility patterns) and supply (network and operations) points of view, as well as their interactions. They provide complementary knowledge and therefore give a more comprehensive view of the MRF patterns. They are defined for entrance and exit fare machines separately:
(i) MRF flow volume: for entrance fare machines, it measures the number of passengers passing through each fare machine at an origin station and going to a destination station. For exit fare machines, it represents the number of passengers entering the metro system at an origin station and tapping out through an exit fare machine. MRF flow volume reflects the mobility behavior of passengers.
(ii) MRF travel time: it indicates the average travel time from a fare machine to a destination station for entrance fare machines, and from an origin station to a fare machine for exit fare machines. It reflects the supply characteristics of the metro system, e.g., the geographical relationship between stations and scheduling, but also the demand characteristics of certain stations, as it includes the time spent waiting to board a train under capacity constraints.
Figure 3 shows the overview of the proposed framework. It consists of three modules: MRF feature extraction, invalid association detection, and associated station inference:
(i) MRF feature extraction module: it constructs the MRF flow volume and travel time tensors to characterize fare machines and extracts latent MRF flow and travel time features using the tensor decomposition technique.
(ii) Invalid association detection module: it detects the invalid associations (between fare machines and stations) in two steps. The valid and invalid associations are initially detected using the isolation forest method. Then, the invalid associations are reinspected (the feedback arrow) using neural networks trained with the valid association data.
(iii) Association station inference module: it infers the station that a fare machine (detected as an invalid association) belongs to, using the trained neural networks.

MRF Tensor Construction.
For data representation, tensors are used to characterize the MRF flow volume and travel time. A tensor is a high-order generalization of a matrix. The multiway property of a tensor fits the nature of MRF features. For example, MRF flow volume can be characterized by a "machine mode" (M), a "time mode" (T), a "day mode" (D), and a "station mode" (S). For entrance fare machines, the "machine mode" denotes the related fare machine ID, the "time mode" represents the time interval of a day (e.g., 6:[…]), the "day mode" denotes the date, and the "station mode" denotes a destination station ID. For exit fare machines, the definitions of the tensor modes are the same as for entrance fare machines, except for the "station mode," which is the origin station ID. In this way, two 4-way tensors are used to represent the MRF flow volume of entrance and exit fare machines, respectively. For example, an entry of 50 at (A, 8:00 to 9:00 AM, January 1, B) in the entrance machine tensor means "the passenger flow volume passing through entrance machine A in the interval 8:00 to 9:00 AM on January 1 and exiting at Station B is 50 passengers." The methodology for fare machine-station association is the same for entrance and exit fare machines, so entrance fare machines are used to illustrate the proposed framework. Unless stated otherwise, "fare machines" and "MRF tensors" refer to entrance fare machines and entrance MRF tensors, respectively.
To construct the MRF flow volume tensor, the mode variables above are transformed into numerical indices.
Unfortunately, nonobservation cells always account for a large share of the MRF travel time tensor (e.g., 63.5% in the studied AFC dataset). Therefore, it is hard to estimate a reasonable average travel time for each such cell from the limited information. Instead, "NaN" values are used to fill those cells to represent the unknown travel times.
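The construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the record layout `(machine_id, hour, day, dest_station, travel_time)`, the hourly time slots, and the function name `build_mrf_tensors` are all hypothetical.

```python
import numpy as np

def build_mrf_tensors(records, n_slots=24):
    """Build entrance-machine MRF tensors from AFC records.

    records: list of (machine_id, hour, day, dest_station, travel_time)
    tuples (an assumed layout). Returns the flow volume tensor, the
    average travel time tensor (NaN where nothing was observed), and
    the label lists used to index the machine, day, and station modes.
    """
    machines = sorted({r[0] for r in records})
    days = sorted({r[2] for r in records})
    stations = sorted({r[3] for r in records})
    # Map each mode variable to a numerical index.
    m_idx = {m: i for i, m in enumerate(machines)}
    d_idx = {d: i for i, d in enumerate(days)}
    s_idx = {s: i for i, s in enumerate(stations)}

    shape = (len(machines), n_slots, len(days), len(stations))
    volume = np.zeros(shape)
    tt_sum = np.zeros(shape)
    for machine, hour, day, station, travel_time in records:
        cell = (m_idx[machine], hour, d_idx[day], s_idx[station])
        volume[cell] += 1
        tt_sum[cell] += travel_time

    # Unobserved cells keep NaN instead of a guessed average.
    travel_time_avg = np.full(shape, np.nan)
    observed = volume > 0
    travel_time_avg[observed] = tt_sum[observed] / volume[observed]
    return volume, travel_time_avg, (machines, days, stations)
```

Keeping NaN in the travel time tensor, rather than zero, preserves the distinction between "no passengers observed" and "zero travel time".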

Tensor Decomposition.
Tensor decomposition is used to extract fare machine features from the MRF flow volume and travel time tensors. Given the different properties of these two tensors, different tensor decomposition methods are developed to extract the MRF flow volume and travel time features, respectively.

Tensor Decomposition of MRF Flow Volume.
For the MRF flow volume tensor $\mathcal{V}$, the CANDECOMP/PARAFAC (CP) decomposition [12] is used to extract the fare machine features. CP decomposition factorizes a tensor into a sum of rank-1 tensors. A rank-1 tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ ($I_i$ is the dimension of mode $i$) is the outer product of $N$ vectors:

$$\mathcal{X} = a^{(1)} \circ a^{(2)} \circ \cdots \circ a^{(N)}, \qquad x_{i_1 i_2 \cdots i_N} = a^{(1)}_{i_1} a^{(2)}_{i_2} \cdots a^{(N)}_{i_N},$$

where $a^{(i)}$ denotes a vector, $a^{(i)}_k$ denotes the $k$th element of $a^{(i)}$, and the symbol $\circ$ denotes the outer product of vectors. The CP decomposition of $\mathcal{V} \in \mathbb{R}^{M \times T \times D \times S}$ can be formulated as follows:

$$\mathcal{V} \approx \sum_{r=1}^{R} m_r \circ t_r \circ d_r \circ s_r,$$

where $R$ represents the total number of components, and $m_r \in \mathbb{R}^{M}$, $t_r \in \mathbb{R}^{T}$, $d_r \in \mathbb{R}^{D}$, and $s_r \in \mathbb{R}^{S}$ represent the component vectors of the machine, time, day, and station modes, respectively. Figure 5 illustrates the process of the CP decomposition of $\mathcal{V}$.
Computing the CP decomposition of $\mathcal{V}$ can be treated as an optimization problem. The goal is to find a CP decomposition $\hat{\mathcal{V}} = \sum_{r=1}^{R} m_r \circ t_r \circ d_r \circ s_r$ with $R$ components that best approximates $\mathcal{V}$. The decomposition $\hat{\mathcal{V}}$ is the solution of the following optimization problem, i.e., find

$$\min_{\hat{\mathcal{V}}} \left\| \mathcal{V} - \hat{\mathcal{V}} \right\|_F,$$

where $\| \cdot \|_F$ denotes the Frobenius norm. This optimization problem can be solved using the alternating least squares (ALS) method [26]. Details of the solution procedure can be found in [12].
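To make the ALS procedure concrete, here is a minimal NumPy sketch for a 4-way tensor. It is an illustration under assumptions rather than the authors' code: the function names, the random initialization, and the fixed iteration count are choices of this sketch, and treating row `i` of the machine-mode factor as the feature vector of machine `i` is one plausible reading of "fare machine features".

```python
import numpy as np

def khatri_rao(mats):
    """Column-wise Kronecker product of a list of (I_k x R) matrices."""
    rank = mats[0].shape[1]
    out = mats[0]
    for m in mats[1:]:
        out = (out[:, None, :] * m[None, :, :]).reshape(-1, rank)
    return out

def unfold(X, mode):
    """Mode-n unfolding: rows index `mode`, columns index the other modes."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def reconstruct(factors):
    """Sum of R rank-1 tensors built from the four factor matrices."""
    return np.einsum('ir,jr,kr,lr->ijkl', *factors)

def cp_als(X, rank, n_iter=50, seed=0):
    """CP decomposition of a 4-way tensor by alternating least squares."""
    rng = np.random.default_rng(seed)
    factors = [rng.random((dim, rank)) for dim in X.shape]
    for _ in range(n_iter):
        for n in range(X.ndim):
            others = [factors[k] for k in range(X.ndim) if k != n]
            # Hadamard product of the Gram matrices of the fixed factors.
            gram = np.ones((rank, rank))
            for f in others:
                gram *= f.T @ f
            # Least-squares update of factor n with the others held fixed.
            factors[n] = unfold(X, n) @ khatri_rao(others) @ np.linalg.pinv(gram)
    return factors
```

With `factors = cp_als(V, rank=R)`, `factors[0]` has shape `(M, R)`, so each fare machine is summarized by `R` numbers instead of a `T x D x S` slice.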

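Tensor Decomposition of MRF Travel Time.
CP decomposition cannot be applied directly to extract travel time features. This is because the travel time tensor $\mathcal{T}$ has nonnumerical (i.e., NaN) entries, which makes the operation $\mathcal{T} - \hat{\mathcal{T}}$ infeasible. A variation of CP decomposition, CP Weighted OPTimization (CP-WOPT) [27], is used for the MRF travel time tensor decomposition. CP-WOPT is widely used to recover tensors with missing entries. It uses a weight tensor to indicate the location of the NaN entries. The formulation is as follows:

$$\min_{\hat{\mathcal{T}}} \left\| \mathcal{W} \ast \left( \mathcal{T} - \hat{\mathcal{T}} \right) \right\|_F, \qquad \hat{\mathcal{T}} = \sum_{r=1}^{R} m_r \circ t_r \circ d_r \circ s_r,$$

where $\ast$ denotes the element-wise product. The weight tensor $\mathcal{W} \in \mathbb{R}^{M \times T \times D \times S}$ has the same shape as $\mathcal{T}$ and is defined as

$$w_{ijkl} = \begin{cases} 1, & \text{if entry } (i,j,k,l) \text{ of } \mathcal{T} \text{ is observed}, \\ 0, & \text{if entry } (i,j,k,l) \text{ of } \mathcal{T} \text{ is NaN}. \end{cases}$$

In the initialization phase, NaN cells are filled with random values. As these values are multiplied by 0 during the optimization, they do not influence the optimization objective (or the optimal solution). After optimization, $\hat{\mathcal{T}}^{*}$ can represent features of the observed travel time.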
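The masked objective can be illustrated with a short NumPy sketch. Note the substitutions: it minimizes the halved, squared form of the masked objective (same minimizer), it uses plain gradient descent with a backtracking step instead of the optimizer of the original CP-WOPT method [27], and it fills NaN cells with zeros rather than random values, which is equivalent because those cells are multiplied by a zero weight. All function names are hypothetical.

```python
import numpy as np

def khatri_rao(mats):
    """Column-wise Kronecker product of a list of (I_k x R) matrices."""
    rank = mats[0].shape[1]
    out = mats[0]
    for m in mats[1:]:
        out = (out[:, None, :] * m[None, :, :]).reshape(-1, rank)
    return out

def unfold(X, mode):
    """Mode-n unfolding of a tensor."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def reconstruct(factors):
    """CP model of a 4-way tensor from its four factor matrices."""
    return np.einsum('ir,jr,kr,lr->ijkl', *factors)

def masked_loss(T0, W, factors):
    """Return 0.5 * ||W * (T_hat - T0)||_F^2 and the masked residual."""
    E = W * (reconstruct(factors) - T0)
    return 0.5 * np.sum(E ** 2), E

def cp_wopt_gd(T, rank, n_iter=200, seed=0):
    """Fit a CP model to the observed (non-NaN) entries of T only."""
    W = (~np.isnan(T)).astype(float)      # 1 = observed, 0 = NaN
    T0 = np.where(np.isnan(T), 0.0, T)    # fill value is irrelevant
    rng = np.random.default_rng(seed)
    factors = [rng.random((dim, rank)) for dim in T.shape]
    loss, E = masked_loss(T0, W, factors)
    history, step = [loss], 1e-2
    for _ in range(n_iter):
        # Gradient of the masked loss with respect to each factor matrix.
        grads = [
            unfold(E, n) @ khatri_rao(
                [factors[k] for k in range(T.ndim) if k != n])
            for n in range(T.ndim)
        ]
        # Backtracking: shrink the step until the loss does not increase.
        while step > 1e-12:
            trial = [f - step * g for f, g in zip(factors, grads)]
            new_loss, new_E = masked_loss(T0, W, trial)
            if new_loss <= loss:
                factors, loss, E = trial, new_loss, new_E
                step *= 2.0
                break
            step *= 0.5
        history.append(loss)
    return factors, W, history
```

Because the residual is multiplied by `W` before it enters either the loss or the gradient, the unobserved cells never pull the factors toward an arbitrary fill value.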

Fare Machine-Station Association.
As fare machines at the same station share similar surrounding points of interest (POIs), the MRF features of these fare machines tend to be similar. Therefore, we first extract the MRF feature of each station. Then, the MRF feature of each machine is compared to the station MRF features. If a fare machine has an MRF feature similar to that of a station, then this station is likely to be the association station of the fare machine. We divide the inference process into two successive problems, P1 and P2.

P1: Invalid Association Fare Machine Detection.
To solve P1, we first make two assumptions.
(1) The MRF features of the invalid associations are anomalies with respect to their recorded stations. More formally, let $C(\cdot)$ be the count function and let $m_{S_a \to S_b}$ denote a fare machine located at station $S_a$ but recorded as station $S_b$; "anomaly" then means

$$C(m_{S_1 \to S_2}) \ll C(m_{S_2 \to S_2}).$$

This indicates that the number of fare machines with association station $S_1$ but recorded station $S_2$ should be far less than the number of valid association fare machines in $S_2$. Note that this assumption does not mean that the total number of invalid association fare machines of $S_2$ is less than the number of valid fare machines. We only require that the fare machines recorded as $S_2$ but actually associated with $S_1$ be a minority in $S_2$. This assumption is reasonable since the errors that lead to invalid fare machine-station associations tend to be random; for example, it is unlikely that many fare machines located in the same station are wrongly recorded as the same other station simultaneously.
(2) The invalid associations happen randomly. This assumption indicates that a fare machine $m$ in station $S$ has an equal probability of being wrongly associated with any of the other stations in the system. This assumption is reasonable since the invalid associations are mainly caused by inadequate data management during database merging, maintenance, or system updates.
Based on these assumptions, the isolation forest method is adopted to solve P1. The isolation forest is an unsupervised model for anomaly detection that can be applied directly to a contaminated dataset.
The only requirement of this method is that the outliers be few and different from the normal instances, which exactly fits the aforementioned assumptions. The isolation forest detects outliers using a special measurement: partitions. It "isolates" observations by randomly selecting a dimension of the MRF feature vector and then randomly selecting a split value between the maximum and minimum values of the selected dimension. Since recursive partitioning can be represented by a tree structure, the number of splits required to isolate an MRF feature is equivalent to the path length from the root node to the terminating node.
This path length, averaged over a forest of such random trees, is a measure of normality. Random partitioning produces noticeably shorter paths for anomalies. Hence, when a forest of random trees collectively produces shorter path lengths for particular fare machines, they are highly likely to be anomalies [22].
Based on the results from the isolation forest, we can divide the fare machine MRF feature vectors into two parts: $\tilde{F}_{\phi}$ contains all the MRF feature vectors that are inferred as invalid (i.e., abnormal) by the isolation forest, and $F_{\phi}$ contains all the MRF feature vectors that are inferred as valid (i.e., normal). The fare machines with their MRF features in $F_{\phi}$ are detected as valid, while the fare machines in $\tilde{F}_{\phi}$ are reinspected in the process of solving P2.
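The path-length idea behind the isolation forest can be illustrated with a small, self-contained sketch using only the Python standard library. It is a simplified teaching version, not the authors' implementation: it builds every tree on the full point set (the original method [22] subsamples), and the function names and default parameters are assumptions. In practice, a maintained library implementation would normally be preferred.

```python
import math
import random

EULER_GAMMA = 0.5772156649

def c_factor(n):
    """Average path length of an unsuccessful BST search among n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n

def build_tree(points, depth, max_depth, rng):
    """Split on a random dimension at a random value between min and max."""
    if depth >= max_depth or len(points) <= 1:
        return ("leaf", len(points))
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return ("leaf", len(points))
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return ("node", dim, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))

def path_length(tree, x, depth=0):
    """Number of splits needed to reach the leaf that contains x."""
    if tree[0] == "leaf":
        # Add the expected extra depth of the unsplit points in the leaf.
        return depth + c_factor(tree[1])
    _, dim, split, left, right = tree
    return path_length(left if x[dim] < split else right, x, depth + 1)

def anomaly_scores(points, n_trees=100, seed=0):
    """Score in (0, 1); values closer to 1 mean shorter paths (anomalies)."""
    rng = random.Random(seed)
    n = len(points)
    max_depth = math.ceil(math.log2(n))
    trees = [build_tree(points, 0, max_depth, rng) for _ in range(n_trees)]
    scores = []
    for x in points:
        mean_depth = sum(path_length(t, x) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_depth / c_factor(n)))
    return scores
```

Here `points` would be the MRF feature vectors of the fare machines being screened; vectors whose score exceeds a chosen threshold would be placed in the invalid set and the rest in the valid set. The threshold itself is a tuning choice that this sketch leaves open.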

P2: Association Station Inference.
In P2, a reinspection of the fare machines in $\tilde{F}_{\phi}$ (the set detected as invalid in P1) is conducted to refine the detection results from P1. The reinspection identifies which associations in $\tilde{F}_{\phi}$ were wrongly detected as invalid. In practical applications, the inference gives operators a sense of the data quality in their AFC database. The model outputs the potential association stations of the detected invalid association fare machines, which facilitates effective field investigation and reduces manpower.
A neural network (NN) is used to model the station MRF feature using the MRF features in $F_{\phi}$ (the set detected as valid). As the number of samples (i.e., fare machines) is limited (e.g., 2000 fare machines in the studied network), the NN training may face underfitting issues. We therefore build one shallow neural network for each station, denoted as the station-NN. For the station-NN $N_i$ of station $S_i$, we label the fare machines in $F_{\phi}$ whose recorded station is $S_i$ as 1 and label the other fare machines in $F_{\phi}$ as 0. It is inadequate to train the station-NN directly with the labeled features. Since a metro system has many stations (e.g., 90 stations in the studied metro system), for any given station the number of positive samples (i.e., MRF features labeled as 1) is much smaller than the number of negative samples (MRF features labeled as 0), which leads to learning bias. We use the adaptive synthetic sampling (ADASYN) [28] approach to oversample the positive samples, ensuring that the number of oversampled positive samples is similar to the number of negative ones. $N_i$ is then trained with the oversampled MRF features and their corresponding labels. After $N_i$ is well trained, the output of the network is the probability that the input fare machine MRF feature belongs to this station. An MRF feature $v$ in $\tilde{F}_{\phi}$ (the set detected as invalid) is input into all the trained station-NNs. Let $P = [P_1, P_2, \ldots, P_S]$ denote the output probabilities of the station-NNs, and let $P_{\pi} = [P_{\pi(1)}, P_{\pi(2)}, \ldots, P_{\pi(S)}]$ be the permutation of $P$ in descending order, where $P_{\pi(i)} > P_{\pi(j)}$ given $i < j$. The top-$k$ stations $K = \{\pi(1), \pi(2), \ldots, \pi(k)\}$ are the most likely association stations of the fare machine corresponding to $v$.
Using $K$, the reinspection for P1 is conducted for each fare machine in $\tilde{F}_{\phi}$ (the set detected as invalid) with the following rule: given a fare machine $m_{S \to s} \in \tilde{F}_{\phi}$ with recorded station $s$, if $s \notin K$, $m$ is inferred as invalid; otherwise, it is inferred as valid. For the fare machines inferred as invalid after the reinspection, the top-$k$ set $K$ is treated as the set of potential association stations. In practice, one can first check the stations in this set to find out whether the fare machine is located there.
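The top-k rule can be expressed in a few lines. The sketch below assumes the station-NN outputs are already available as a list of probabilities indexed by station; the function names and the default `k=3` are illustrative choices.

```python
def top_k_stations(probabilities, k):
    """Indices of the k stations with the highest station-NN probability."""
    order = sorted(range(len(probabilities)),
                   key=lambda i: probabilities[i], reverse=True)
    return order[:k]

def reinspect(recorded_station, probabilities, k=3):
    """Apply the rule: valid if the recorded station is among the top k."""
    candidates = top_k_stations(probabilities, k)
    is_valid = recorded_station in candidates
    return is_valid, candidates

# Station-NN outputs for one fare machine (made-up numbers).
probs = [0.10, 0.70, 0.05, 0.60, 0.30]
print(reinspect(3, probs))  # recorded station 3 is among the top 3
print(reinspect(0, probs))  # recorded station 0 is not
```

A larger `k` makes the rule more lenient: fewer machines stay flagged as invalid, but the candidate set that has to be checked in the field grows.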

Case Study
We use AFC data from an urban metro system to evaluate the proposed detection and inference approach. The data cover 7 days, from January 15 to 21, 2018. The fare machine-station association information was carefully checked to ensure its validity as a benchmark. Figure 6 shows the number of machines in the metro system during the studied time span.

Experimental Setup.
We randomly select 1000 entrance fare machines and 1000 exit fare machines and collect the corresponding AFC transaction records to construct the experimental dataset. We then randomly choose a set of fare machines and modify their associated stations to create invalid associations. The proposed approach is validated with the ratio of invalidly associated fare machines ranging from 5% to 40%. The approach is run 20 times per scenario to reduce the effect of randomness. Table 1 summarizes the model parameters used in the experiments. Table 2 tabulates the relations between the truth/falseness of the detection and valid/invalid associations.

Performance Evaluation.
A set of performance metrics is used to comprehensively evaluate the model performance, including accuracy (Accu), true positive rate (TPR), and false positive rate (FPR):

$$\mathrm{Accu} = \frac{N_T}{N_{PN}},$$

where $N_{PN}$ is the total number of associations (or fare machines) and $N_T$ is the number of correctly detected associations (between fare machines and stations). The correctly detected fare machines include cases that are truly positive and truly negative.

$$\mathrm{TPR} = \frac{N_{TP}}{N_P},$$

where $N_{TP}$ is the number of truthfully detected invalid associations (an invalid association correctly inferred as invalid), and $N_P$ is the number of invalid associations. TPR measures the model's sensitivity towards invalid associations.

$$\mathrm{FPR} = \frac{N_{FP}}{N_N},$$

where $N_{FP}$ is the number of falsely detected valid associations (a valid association falsely inferred as invalid) and $N_N$ is the total number of valid associations. FPR measures the misjudgement rate of the valid associations.
Figure 7 shows the detection results with the invalid association ratio ranging from 5% to 40%. The results indicate that the isolation forest model is robust to invalid associations when the invalid association ratio is less than 20% (the detection accuracy is over 96%). It can still achieve a detection accuracy of 87% even when 40% of the fare machines are wrongly associated with stations in the data. The TPR is an essential characteristic of the detection of invalid associations in P1, since the associations in $F_{\phi}$ (the set detected as valid) are not reinspected in the following steps of the approach. That is, the wrongly associated fare machines in $F_{\phi}$ will remain undetected, which may eventually impact practical applications. Also, it is favorable to detect more invalid associations to ensure a clean MRF feature set for each station, which benefits the correction of invalid associations in P2. The TPR is over 90% when the invalid association ratio is less than 20%, which indicates the promising performance of the proposed approach in detecting invalid associations. The FPR is very low (less than 5%), and it decreases as the invalid association ratio increases, as expected.
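The three metrics defined above can be computed directly from per-machine flags. This is a minimal sketch in which "positive" means "invalid association", as in the text; the function name and the input format (two boolean lists) are assumptions.

```python
def detection_metrics(is_invalid_true, is_invalid_pred):
    """Accuracy, TPR, and FPR, with invalid associations as positives."""
    pairs = list(zip(is_invalid_true, is_invalid_pred))
    n_tp = sum(1 for t, p in pairs if t and p)          # invalid, flagged
    n_tn = sum(1 for t, p in pairs if not t and not p)  # valid, not flagged
    n_fp = sum(1 for t, p in pairs if not t and p)      # valid, flagged
    n_p = sum(1 for t, _ in pairs if t)                 # all invalid
    n_n = len(pairs) - n_p                              # all valid
    return {
        "accuracy": (n_tp + n_tn) / len(pairs),
        "tpr": n_tp / n_p if n_p else float("nan"),
        "fpr": n_fp / n_n if n_n else float("nan"),
    }

truth = [True, True, False, False, False]
pred = [True, False, False, True, False]
print(detection_metrics(truth, pred))
```

In this toy example, 3 of 5 machines are classified correctly, 1 of 2 invalid associations is caught, and 1 of 3 valid associations is wrongly flagged.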

Evaluation of Association Inference (P2).
For the P2 evaluation (rematching wrongly associated fare machines to stations), we quantify the model's capability to effectively infer the correct station using the top-$k$ accuracy:

$$\mathrm{Accu}_k = \frac{N_c}{N_{F^{*}_{\phi}}},$$

where $N_c$ is the number of fare machines in $F^{*}_{\phi}$ with their matched station contained in $[\pi(1), \pi(2), \ldots, \pi(k)]$ and $N_{F^{*}_{\phi}}$ is the number of fare machines in $F^{*}_{\phi}$. Table 3 summarizes the model performance for P2 with varied levels of invalid association ratios in the dataset. The results show that the top-$k$ accuracy exceeds 90% when $k$ is greater than 3, regardless of the invalid association ratio. It indicates that the top 3 inferred stations from the model are highly likely to include the correctly associated station of the studied fare machine. This has an important implication for field investigations in practice, i.e., checking the most likely stations that the invalidly associated fare machines may belong to.
Table 1 (excerpt). Optimizer: Adam (Adam refers to the optimization algorithm proposed in [29]); number of neurons: (16, 5).
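The top-k accuracy can be sketched as follows, assuming each evaluated fare machine comes with its true station and a list of stations ranked by descending station-NN probability. The function name and the toy labels are illustrative only.

```python
def top_k_accuracy(true_stations, ranked_stations, k):
    """Share of machines whose true station is among the first k candidates."""
    hits = sum(
        1 for station, ranking in zip(true_stations, ranked_stations)
        if station in ranking[:k]
    )
    return hits / len(true_stations)

true_stations = ["A", "B", "C", "D"]
ranked = [
    ["A", "B", "C"],
    ["C", "B", "A"],
    ["A", "B", "D"],
    ["D", "A", "B"],
]
for k in (1, 2, 3):
    print(k, top_k_accuracy(true_stations, ranked, k))
```

The value never decreases as `k` grows, so choosing `k` is a trade-off between hit rate and the number of stations to visit.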

Latent Feature Analysis.
The effectiveness of the detection and inference models rests on the quality of the MRF features. That is, fare machines at different stations should preferably have significantly different MRF features. To explore the feature quality, we use principal component analysis (PCA) [7] to reduce the dimension of the MRF feature vector to two. We randomly choose 5 stations in the studied metro system, select one station as the reference station, and compare its MRF feature vectors to those of the other 4 stations, respectively. Figure 8 shows the MRF feature visualization results. The results show that the MRF features of different stations exhibit significant differences, which indicates high feature quality. This helps the model form a relatively distinct MRF feature for each station, which in turn makes it effective at detecting invalid associations and inferring the associated stations of the fare machines. The MRF features of fare machines show different patterns at different stations. For example, the MRF features of machines in Station E (Figure 8(d)) are very similar to each other, while the MRF features of Station B (Figure 8(a)) are more dispersed. The reason partly lies in the different layouts of the stations. For some large stations (e.g., transfer stations in the commercial center), there are many gates for entering/exiting the station, which may lead to variance in travel time between the same OD pairs. This is likely the main reason for the missed and wrong detections of the proposed model.
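The two-dimensional projection used for this kind of visualization can be sketched with NumPy's SVD. This is a generic PCA sketch, not the authors' plotting code; the function name is hypothetical, and the rows of `features` are assumed to be the MRF feature vectors of individual fare machines.

```python
import numpy as np

def pca_2d(features):
    """Project row vectors onto their first two principal components."""
    X = np.asarray(features, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each dimension
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T                         # coordinates on PC1, PC2

# Toy feature vectors for four machines (made-up numbers).
features = [[0.0, 0.0, 0.0],
            [1.0, 1.0, 1.0],
            [2.0, 2.0, 2.0],
            [3.0, 3.0, 3.0]]
print(pca_2d(features).shape)
```

Plotting the two output columns, colored by station, gives a scatter plot in which well-separated stations appear as distinct clusters.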

Conclusion
Ensuring data quality is essential for the effective use of data in practice. This paper proposes a model to detect invalid data in the AFC dataset caused by erroneous associations between fare machines and stations (e.g., due to delayed dictionary updates or incorrect data merging). It combines tensor decomposition, isolation forest, and NN methods to detect the invalid associations in the recorded dataset and infer the correct association station that a fare machine belongs to. The model is validated using the AFC data of a busy metro system. The experimental results show that invalid associations can be detected with more than 90% accuracy when the invalid association ratio is low. Also, the model is robust to invalid associations and can still achieve 69.62% accuracy in the extreme case when the invalid association ratio is 55%. The association station inference results indicate that the top 3 inferred stations from the model are highly likely to include the correctly associated station of the studied fare machine (around 90%). This has an important implication for field investigations of these probable stations in practice.
The proposed model provides useful knowledge for AFC data management in terms of data quality checks and fixing invalid data. Though the study focuses on the invalid data detection problem, the model is general and can be extended to inference applications, e.g., inferring the alighting stations for a bus system that has only boarding records. As the extracted MRF features are meaningful, further studies could build on them, for example, by analysing the different utilization of fare machines at different gates of the same station to improve infrastructure efficiency.

Data Availability
The AFC data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.