An Efficient Prediction Model on the Operation Quality of Medical Equipment Based on Improved Sparrow Search Algorithm-Temporal Convolutional Network-BiLSTM

Combining medical IoT and artificial intelligence technology is an effective approach to achieve the intelligence of medical equipment. This integration can address issues such as low image quality caused by fluctuations in power quality and potential equipment damage, and this study proposes a predictive model, ISSA-TCN-BiLSTM, based on a bi-directional long short-term memory network (BiLSTM). Firstly, power quality data and other data from MRI and CT equipment within a 6-month period are collected using current fingerprint technology. The key factors affecting the active power of medical equipment are explored using the Pearson coefficient method. Subsequently, a Temporal Convolutional Network (TCN) is employed to conduct multi-layer convolution operations on the input temporal feature sequences, enabling the learning of global temporal feature information while minimizing the interference of redundant data. Additionally, bidirectional long short-term memory (BiLSTM) is integrated to model the intermediate active power features, facilitating accurate prediction of medical equipment power quality. Additionally, an improved Sparrow Search Algorithm (ISSA) is utilized for hyperparameter optimization of the TCN-BiLSTM model, enabling optimization of the active power of different medical equipment. Experimental results demonstrate that the ISSA-TCN-BiLSTM model outperforms other comparative models in terms of RMSE, MSE, and R2, with values of 0.1143, 0.1157, 0.0873, 0.0817, 0.95, and 0.96, respectively, for MRI and CT equipment. This model exhibits both prediction speed and accuracy in power prediction for medical equipment, providing valuable guidance for equipment maintenance and diagnostic efficiency enhancement.


Introduction
Medical equipment plays a vital role in supporting and ensuring the order and quality of diagnosis and treatment in hospitals [1].In particular, large-scale medical equipment, such as computed tomography (CT) and magnetic resonance imaging (MRI), has greatly improved the level of medical care [2].However, this medical equipment, embedded within complex operating systems and components sensitive to factors such as power quality, can cause distortions in medical images and even equipment failures because of fluctuations in active power, frequency interference, and electromagnetic interference, thereby affecting clinical diagnosis and treatment [3].The occurrence of downtime in medical equipment can lead to unpredictable medical accidents and economic losses.Therefore, it is imperative for hospitals to ensure the reliability and operational excellence of their medical equipment.This is crucial to prevent interruptions, shutdowns, and other malfunctions that could compromise patient safety and the quality of diagnosis and treatment.
Currently, hospitals primarily enhance the reliability and efficiency of medical equipment through manual inspections and routine maintenance aligned with manufacturers' guidelines.Nevertheless, as medical technology evolves and the quantity of medical equipment in contemporary hospitals surges, this approach is leading to inefficiencies, escalating costs, and delayed response times.Moreover, such traditional maintenance methods are often inadequate for predicting equipment abnormalities or sudden failures [4].At the same time, the development of emerging technologies, such as the Internet of Things (IoT) and artificial intelligence (AI) is driving the innovation of medical equipment maintenance techniques [5][6][7].IoT technology enables real-time monitoring of various electrical quality parameters (such as voltage, current, and active power) of medical equipment during operation and standby, as well as environmental characteristics (such as temperature and humidity) of the equipment's working space by deploying sensors.The data from the equipment is then transmitted to a cloud platform through wireless technologies, such as Wi-Fi or cellular networks (4G/5G), and wired methods, such as ethernet or optical fiber.Finally, predictive models for medical equipment's power quality are established based on artificial intelligence techniques such as machine learning and deep learning [8,9].Machine learning-based predictive maintenance [10][11][12][13] strategies have already seen widespread implementation across various industries, including the mechanical and power sectors.These innovative approaches leverage data analysis and predictive algorithms to anticipate equipment failures before they occur, improving efficiency and reducing downtime.Dai et al. [14] introduced an intelligent diagnostic technique utilizing a multiscale gated convolutional neural network (MGCNN) specifically for diagnosing wear in sliding bearings.This approach offers robust support for monitoring the condition of equipment and enhances the capabilities of predictive maintenance.Lin et al. [15] enhanced the operational performance of power lines by employing advanced algorithms, such as support vector regression (SVR), gradient boosting regression (GBR), and long short-term memory neural networks (LSTM), which were instrumental in predicting leakage current in insulators with greater accuracy.However, predictive maintenance strategies for medical equipment are still in the early stages of development.Wang [3] proposed a multivariate time series classification model to predict abnormal conditions in CT equipment, using a sliding time window technique, which constructs new features over different time intervals.This approach helps reduce the interference of other factors on the predictive results.Spahić et al. [16] proposed a method based on artificial neural networks and fuzzy logic classifiers to predict the performance of infant incubators for upgrading medical equipment management strategies in healthcare institutions to address challenges of increased sophistication of equipment as well as patient safety demands.
The accurate prediction of active power of medical equipment is of great significance for identifying the operating status of the equipment.Due to the non-linear, dynamic, and uncertain characteristics of active power output in medical equipment, a combination of neural networks and optimization algorithms is necessary to enhance prediction accuracy [17].To precisely identify the operating status of medical equipment and advance the development of predictive maintenance strategies for medical equipment, this paper introduces a medical equipment active power prediction model based on ISSA-TCN-BiLSTM.The input features play a crucial role in determining prediction accuracy [18].The Pearson coefficient method is utilized to select key factors that influence changes in active power, enabling the model to effectively learn the underlying relationships within the data.Moreover, to overcome TCN-BiLSTM's vulnerability to local optima when dealing with a large number of features in predicting active power for medical equipment, an enhanced Sparrow Search Algorithm is employed to search for optimal hyperparameters, thereby improving the reliability and stability of the predictive model.

Prediction Methods for Equipment Power
An active power prediction model for medical equipment is developed utilizing the improved Sparrow Search Algorithm (ISSA) and the TCN-BiLSTM algorithm.This model comprises four key components: data preprocessing, Pearson correlation analysis, data prediction, and hyperparameter optimization.The complete prediction workflow is depicted in Figure 1.
points, maintaining the integrity of the dataset.Subsequently, the classification sample set is normalized to mitigate the impact of varying dimensions on the outcomes.Pearson correlation coefficients are used to pinpoint and choose features that significantly affect active power, thereby enhancing the quality of the training data for the active power prediction models.Finally, the refined training set is inputted into the TCN-BiLSTM model for training, with the ISSA being utilized to fine-tune the model's hyperparameters.

TCN-BiLSTM
TCN is a time series data processing method based on convolutional neural networks [19].Its main structure consists of a residual block containing dilated causal convolution.The residual block of TCN is illustrated in Figure 2.
TCN integrates residual blocks with dilated causal convolution, built on a foundation of one-dimensional convolution, to enhance the extraction of temporal features while also Initially, intelligent equipment, such as dynamic energy identifiers and dynamic environment sensors, can gather power quality data from medical equipment and environmental information from within the equipment rooms.In order to improve the prediction accuracy of the model, this paper uses the K-nearest neighbor (KNN) method to complete the missing data and the pauta criterion to eliminate the abnormal data.KNN is a powerful technique that allows us to impute missing values based on the similarity of data points, maintaining the integrity of the dataset.Subsequently, the classification sample set is normalized to mitigate the impact of varying dimensions on the outcomes.Pearson correlation coefficients are used to pinpoint and choose features that significantly affect active power, thereby enhancing the quality of the training data for the active power prediction models.Finally, the refined training set is inputted into the TCN-BiLSTM model for training, with the ISSA being utilized to fine-tune the model's hyperparameters.

TCN-BiLSTM
TCN is a time series data processing method based on convolutional neural networks [19].Its main structure consists of a residual block containing dilated causal convolution.The residual block of TCN is illustrated in Figure 2.
TCN integrates residual blocks with dilated causal convolution, built on a foundation of one-dimensional convolution, to enhance the extraction of temporal features while also tackling the issue of vanishing or exploding gradients.Causal convolution is crucially utilized to ensure that predictions rely solely on past inputs, preventing future information leakage.Dilated convolution introduces interval sampling of inputs during convolution operations, effectively addressing the challenge of extracting information from multivariate time series data and expanding the receptive field.Additionally, employing the Rectified Linear Unit (ReLU) activation function and dropout techniques can significantly reduce overfitting, thereby improving the network's learning speed and prediction accuracy.
leakage.Dilated convolution introduces interval sampling of inputs during convolution operations, effectively addressing the challenge of extracting information from multivariate time series data and expanding the receptive field.Additionally, employing the Rectified Linear Unit (ReLU) activation function and dropout techniques can significantly reduce overfitting, thereby improving the network's learning speed and prediction accuracy.The long short-term memory (LSTM) [20] neural network is a specialized variant of the recurrent neural network (RNN) [21], designed specifically to overcome the challenges of vanishing and exploding gradients that conventional RNNs encounter while processing sequential data.LSTM networks are widely used in time series scenarios that require long-term interval and time delay prediction.
LSTM network enhances the standard RNN framework by incorporating additional layers, which include memory cells and three distinct gates: the input gate, forget gate, and output gate.This structure enables selective information flow, with each gate performing a unique role in regulating the passage of data.
The forget gate is responsible for determining which information to discard from the storage unit.The calculation formula is as follows: (1) In the equation, , , , and respectively denote the weight coefficients and bias for the forget gate.
The input gate is responsible for deciding which information to retain in the memory cell.The calculation formula is as follows: (2)  The long short-term memory (LSTM) [20] neural network is a specialized variant of the recurrent neural network (RNN) [21], designed specifically to overcome the challenges of vanishing and exploding gradients that conventional RNNs encounter while processing sequential data.LSTM networks are widely used in time series scenarios that require long-term interval and time delay prediction.
LSTM network enhances the standard RNN framework by incorporating additional layers, which include memory cells and three distinct gates: the input gate, forget gate, and output gate.This structure enables selective information flow, with each gate performing a unique role in regulating the passage of data.
The forget gate is responsible for determining which information to discard from the storage unit.The calculation formula is as follows: In the equation, w f x , w f h , w f c , and b f respectively denote the weight coefficients and bias for the forget gate.The input gate is responsible for deciding which information to retain in the memory cell.The calculation formula is as follows: In the equation, w ix , w ih , w ic , and b i respectively denote the weight coefficients and bias for the forget gate, and w cx , w ch , and b O respectively denote the weight coefficients and bias for the forget gate.
The output gate determines which information will be outputted.No other information can pass through the output gate besides what is required.The update equation is as follows: C t represent the candidate vector and the update value of the candidate vector at time t, respectively.The outputs at times t and (t − 1) are represented by h t and h t−1 , respectively.
BiLSTM [22] enhances the network's capacity to capture temporal sequence information by overlaying two independent LSTM layers, enabling the simultaneous acquisition of both past and future contextual information.The structure of BiLSTM is illustrated in Figure 3.
as follows: (5) (6) In the equation, , and respectively denote the input gate, forget gate and output gate, and represents the input at time t.The sigmoid activation function is represented by , and represents the hyperbolic tangent activation function.The weight coefficients of the output gate are denoted by , , , and and represent the candidate vector and the update value of the candidate vector at time t, respectively.The outputs at times t and (t -1) are represented by and , respectively.BiLSTM [22] enhances the network's capacity to capture temporal sequence information by overlaying two independent LSTM layers, enabling the simultaneous acquisition of both past and future contextual information.The structure of BiLSTM is illustrated in Figure 3.The forward and backward LSTM networks compute sequence information for forward and reverse input, respectively, yielding hidden state outputs of and , which are then concatenated to produce the final output of the BiLSTM network layer.The specific calculation process is shown in Equations ( 7)-( 9).(7) (8) (9) This study combines the advantages of TCN and BiLSTM to construct a predictive model for the active power sequence of medical equipment, with its architecture depicted in Figure 4.The forward and backward LSTM networks compute sequence information for forward and reverse input, respectively, yielding hidden state outputs of → H and ← H, which are then concatenated to produce the final output of the BiLSTM network layer.The specific calculation process is shown in Equations ( 7)- (9).
This study combines the advantages of TCN and BiLSTM to construct a predictive model for the active power sequence of medical equipment, with its architecture depicted in Figure 4.
The model first uses a stack of three TCN residual blocks to capture a broader receptive field of the input sequence and to conduct feature extraction and dimensionality reduction.To prevent gradient explosion and vanishing, each residual block maintains a consistent kernel size, denoted as k.Unlike traditional convolution, dilated convolution samples input features at specified intervals, with the sampling rate determined by the dilation factor D. The effective sampling window of dilated convolution increases exponentially with the number of convolutional layers.This approach enables the model to achieve a large receptive field while utilizing fewer convolutional layers.Consequently, the dilation factors D are set to 1, 2, and 4, respectively.The BiLSTM layer, receiving the data sequence processed by the TCN, concatenates two LSTM layers that operate in opposite temporal directions-one forward and one backward.This approach enables BiLSTM to more comprehensively capture temporal dependencies and contextual associations.Finally, a fully connected layer maps the high-dimensional features to the final prediction outcome.
the dilation factors D are set to 1, 2, and 4, respectively.The BiLSTM layer, receiving the data sequence processed by the TCN, concatenates two LSTM layers that operate in opposite temporal directions-one forward and one backward.This approach enables BiLSTM to more comprehensively capture temporal dependencies and contextual associations.Finally, a fully connected layer maps the high-dimensional features to the final prediction outcome.

SSA
The Sparrow Search Algorithm (SSA) [23], introduced by Donghua University researchers in 2020, draws inspiration from the intricate behaviors of sparrows as they forage and evade predators.The SSA delineates the sparrow population into distinct roles: discoverers, followers, and vigilantes.Each group updates its position according to a specialized formula that reflects its function.When vigilantes perceive a threat, they signal an alarm, and if the danger level crosses a predefined threshold, the discoverers lead the followers to a safer location.This process operates iteratively, maintaining a stable ratio of discoverers to followers [24,25].Simultaneously, followers, motivated by hunger, may embark on expeditions to uncharted territories in search of more energy-dense food.In addition, followers may enter a state of competition with the discoverers, who have secured the most nutritious food, thereby increasing their own foraging efficiency.
In the SSA, discoverers determine the search direction for the population, and their position updates during iterations are described by the Equations ( 10) and (11).If , it implies a safe environment, facilitating a global search.Conversely, if , it indicates the presence of danger, prompting the discoverers to adjust their positions.

SSA
The Sparrow Search Algorithm (SSA) [23], introduced by Donghua University researchers in 2020, draws inspiration from the intricate behaviors of sparrows as they forage and evade predators.The SSA delineates the sparrow population into distinct roles: discoverers, followers, and vigilantes.Each group updates its position according to a specialized formula that reflects its function.When vigilantes perceive a threat, they signal an alarm, and if the danger level crosses a predefined threshold, the discoverers lead the followers to a safer location.This process operates iteratively, maintaining a stable ratio of discoverers to followers [24,25].Simultaneously, followers, motivated by hunger, may embark on expeditions to uncharted territories in search of more energy-dense food.In addition, followers may enter a state of competition with the discoverers, who have secured the most nutritious food, thereby increasing their own foraging efficiency.
In the SSA, discoverers determine the search direction for the population, and their position updates during iterations are described by the Equations ( 10) and (11).If R 2 < s, it implies a safe environment, facilitating a global search.Conversely, if R 2 ⩾ s, it indicates the presence of danger, prompting the discoverers to adjust their positions.
In Equations ( 10) and ( 11), X i,j denotes the positional information of the sparrows, t denotes the number of iterations, α ∈ (0, 1] is a random number, n max represents the maximum number of iterations, R 2 ∈ [0, 1] denotes the threshold value, s ∈ [0.5, 1] signifies the safety value, Q is a random variable following a standard normal distribution, and L represents a 1 × d matrix.The follower's position update is described as per the Equations ( 12) and ( 13), indicating that for i > n/2, the ith follower is in a poorer state, necessitating the search for a new position.
Sensors 2024, 24, 5589 7 of 15 In Equations ( 12) and ( 13), X worst represents the current global worst position, X P denotes the position of the best explorer, and A is a 1 × d matrix with each element randomly assigned a value of 1 or −1.
When certain individuals sense danger, their positions undergo a jump, as depicted by the given Equations ( 14) and (15).If f i > f g , it indicates the individual is at the edge of the population; if f i = f g , it signifies that individuals in the middle of the population should move closer to the group to reduce risk.
In Equations ( 14) and ( 15), X t best denotes the current global optimum position, β signifies a normally distributed random number with mean 0 and variance 1, f i stands for fitness value, f g and f w represent the current best and worst fitness values, respectively, K is a random number, and ε is a nonzero constant.

Improved SSA
SSA, as a metaheuristic optimization algorithm, can be used to optimize the parameter selection of TCN-BiLSTM, reducing computation time and improving prediction accuracy.The Sparrow Search Algorithm is notably effective in tackling intricate optimization challenges.However, its reliance on a randomly generated initial population can result in an uneven distribution, which may restrict the diversity and quality of the population.This can impede the convergence rate of the algorithm [26].Moreover, SSA's performance may be further compromised by a lack of robust parameter control mechanisms, making it prone to getting trapped in local optima and experiencing reduced accuracy in subsequent iterations.
Opposition-based learning (OBL) [27] is a strategy that has gained significant recognition for its ability to enhance the diversity of populations within intelligent algorithms.By introducing the concept of opposing points and replacing random searches with adversarial searches, OBL effectively accelerates the convergence rate and strengthens the precision of solutions derived from algorithms.In this study, the fundamental concept behind utilizing the OBL strategy for population generation is to start by creating an initial random population.Subsequently, a corresponding opposing population is formed based on the initial one.The superior set from these two is then selected to be the progenitor of the next generation.The OBL strategy prioritizes individuals that are closer to the optimal solution than the initial members of the population.This strategic approach ensures that each individual begins closer to the optimal solution, thereby speeding up the convergence rate for the entire population.Moreover, OBL contributes to enhancing population diversity and fortifying the algorithm's capability to conduct a more thorough global search by exploring more promising regions.
According to previous research [28], the step size control parameters β and K are two of the most important adjustable parameters in the SSA.Appropriate step size control parameters can balance the local and global search capabilities of the algorithm, thereby reducing the number of iterations required to locate the optimal solution and improving the performance of the SSA.However, due to the fact that the step size control parameters in the SSA are all random numbers, it is not possible for the discoverer to fully explore the space, which may lead to the algorithm falling into local optima.Due to the negative correlation between step size control parameters and the algorithm population diversity, in the early stages of algorithm iteration, the population has high diversity, which means that the ability to explore global space is strong and the ability to explore local space is weak.Therefore, in this study, the value of the step control parameter is set to a smaller value at first, in order to enhance the local exploration ability.In the later stage of iteration, when all sparrows are attracted to the global optimum and there is not enough search space, the algorithm tends to converge.At this time, the step size control parameter should be set to a larger value to avoid getting stuck in local optima.Therefore, dynamically adjusting the step size control parameters in this study can balance the global and local search capabilities of the SSA, while improving model prediction accuracy and avoiding the occurrence of local optimal solutions.The improved step size control parameters β and K are shown in Equations ( 16) and ( 17): In the equation, f g and f w represent the current best and worst fitness values, n max represents the maximum number of iterations, and r ∈ [0, 1] is a random number.
In order to further reduce the possibility of the SSA falling into local optima, which would result in an inability to find the global optimal solution, the Levy flight strategy [29] is introduced to enhance the local search ability of the SSA.This enhancement effectively solves the problem of the SSA falling into local optima.The improved SSA updates the position based on the distance between the current position and the sparrow's optimal position, significantly reducing the risk of sparrows falling into local optima.Equation ( 14) is updated to Equation (18).
In the equation, d represents the spatial dimension.The Lévy calculation formula is shown in Equations ( 19) and (20).

Explanation of Experimental Data
The experimental data used in this study were obtained from the active power readings, frequency of use, operational status, spatial environmental parameters of the equipment room (temperature, humidity, and cleanliness), and the maintenance frequency of MRI and CT equipment at a hospital in East China from June to December 2023.The data consist of active power measurements, equipment parameters, and spatial environmental parameters of the equipment room gathered from the equipment rooms with MRI and CT equipment.The data was sampled every 10 min within the time windows of 08:00-16:00; 48 data were obtained every day, and a total of 10,224 data records were collected for active power prediction on 5 and 6 January 2024.In the sample of 213 d, the ratio of training sets and test sets is 7:3.The seven influence factors and active power values of historical data are taken as input vectors of the prediction model, and the output result of the model is the active power value of the forecast day.After conducting a Pearson correlation analysis between the collected data and power, data with high correlation (including equipment parameters, operational status, temperature, humidity, equipment cleanliness, and maintenance frequency) were retained.These variables also served as input parameters for the network.The data characteristics of these influencing factors are shown in Table 1.

Active power
The active power value of the medical equipment at the sampling time point, expressed in KW.

Frequency of use
Record of the frequency of the use of medical equipment, expressed as number of times.

Operational status
Record of the current operating state of the equipment.MRI equipment and CT equipment are divided into standby state and running state.

Temperature
The temperature change in the equipment room affects the stability of medical devices.

Humidity
The humidity change in the equipment room affects the stability of medical devices.

Equipment cleanliness
The cleanliness index of medical equipment is evaluated by scoring.

Maintenance frequency
Record the maintenance times of the medical equipment.Maintenance of medical equipment can improve the operation stability.

Equipment parameters
Includes models of medical equipment (MRI and CT) and power levels.

Anomaly Data Processing
During the collection of historical meteorological data and active power data, storage device faults and network signal transmission fluctuations may affect the sample data, resulting in outliers and missing values.In order to improve the effectiveness and availability of the collected data and realize the accurate prediction of the active power of medical equipment, the data collected in this period is preprocessed every 2 h.To enhance the model's predictive accuracy, this study utilizes the KNN method to impute missing data and applies the pauta criterion to eliminate outlier data.As shown in the Equation ( 21), KNN first calculates the Euclidian distance d i (X i , X j ) between the target active power data (data records with missing items) and all fully valued data records in the data set.
In the equation, X i = {x i1 , x i2 , • • • , x im } represents the first m-dimensional data of the ith sample, and x ir denotes the rth dimensional attribute of the ith sample.Secondly, the k data records with the smallest Euclidian distance from the target data are selected as the nearest neighbors of the target data.As shown in the formula, the weight of the nearest neighbor of the target data is as follows: Finally, the weighted average value of the corresponding position value recorded by k's nearest neighbor data is the estimated value of the missing active power data: In the equation, w i represents the value of the nearest neighbor corresponding position.
Sensors 2024, 24, 5589 10 of 15 The equation for pauta criterion is as follows: In the equation, ∆x i = x i − x i .If the condition is met, x i is identified as an outlier and will be replaced using the K-nearest neighbors method.

Data Normalization
To expedite model training, the raw sequential data is normalized to compress the value range of all features between zero and one, maintaining the variance and distribution while preventing excessively large values from disproportionately impacting the model.The equation of data normalization is as follows: In the equation, x ′ ∈ [0, 1] denotes the normalized data, and x min and x max represent the minimum and maximum values in the dataset, respectively.

Model Evaluation Metrics
To assess the predictive performance of the model, the study used the following metrics: mean squared error (MSE), mean absolute error (MAE), R-squared (R 2 ), and run time (RT).Lower values of MSE and MAE, along with a higher R 2 , signify enhanced predictive accuracy.While upholding predictive accuracy, a reduced RT signifies superior model performance.The equations for the respective evaluation metrics are as follows: In the equations, denotes the dimension of the power series in the test set, represents the actual power, is the predicted power, and is the mean of the actual power values.

Experiment Platform and Training Hyperparameters
In this study, the improved model is used on a deep learning server with the configuration shown in Table 2.
In the equations, denotes the dimension of the power series in the test set, represents the actual power, is the predicted power, and is the mean of the actual power values.

Experiment Platform and Training Hyperparameters
In this study, the improved model is used on a deep learning server with the configuration shown in Table 2.
In the equations, denotes the dimension of the power series in the test set, represents the actual power, is the predicted power, and is the mean of the actual power values.

Experiment Platform and Training Hyperparameters
In this study, the improved model is used on a deep learning server with the configuration shown in Table 2.

Configuration
Parameter CPU Inter(R) Xeon(R) W-2223 (Inter, Santa Clara, CA, USA) GPU Nvidia GeForce RTX 2080ti (Nvidia, Santa Clara, CA, USA) In the equations, n denotes the dimension of the power series in the test set, P i represents the actual power, In the equations, denotes the dimension of the power series in the test set, represents the actual power, is the predicted power, and is the mean of the actual power values.

Experiment Platform and Training Hyperparameters
In this study, the improved model is used on a deep learning server with the configuration shown in Table 2.
P i is the predicted power, and P is the mean of the actual power values.

Experiment Platform and Training Hyperparameters
In this study, the improved model is used on a deep learning server with the configuration shown in Table 2. Some of the training hyperparameters of the ISSA-TCN-BiLSTM model are as follows: The convolution kernel size is 3; the expansion coefficient D is 1,2,4; the dropout probability in TCN is 0.1; the number of BiLSTM layers is 2; and the total iteration number is 300.The selection of ISSA parameters is based on a large number of experiments and references, and these values provide the best evaluation indicators and the best computational efficiency on the training data set.The training hyperparameters of the improved SSA are shown in Table 3.

Model Comparison Analysis
To verify the predictive performance of the proposed model, this study compares it with various other forecasting models.ISSA-TCN-BiLSTM is compared against models such as the Gated Recurrent Unit (GRU) [30], LSTM networks, and TCN-BiLSTM forecasting models.In order to verify the predictive performance of the model for active power of different medical equipment, 5 January 2024, was selected as the prediction day to predict the active power of MRI and CT equipment, as shown in Figure 5. Figure 5 displays the prediction curves of active power for MRI and CT equipment using four different models.As the medical equipment transitions from standby to startup, there is a slight increase and fluctuation in active power.When the medical equipment enters operating mode, the active power suddenly increases.Meanwhile, Figure 6 illustrates the prediction errors corresponding to different models.The gating mechanism in LSTM and the GRU allows the model to selectively receive or ignore input information; thus, it still has a good prediction effect during the time period when the operating state of the equipment changes, and then the change of active power data shows an inflection point.The GRU and LSTM cannot effectively extract the variation characteristics of the active power of the equipment, as the operating state of the equipment changes frequently.It can be seen from Figure 6 that the ISSA-TCN-BiLSTM proposed in this paper has the smallest error curve fluctuation compared to other models.This is because the active power of medical equipment is influenced by various key factors, such as operating conditions, computer room environment, usage frequency, and maintenance frequency.Preprocessing the dataset before prediction can optimize the input samples.The TCN-BiLSTM combination model can merge the advantages of different algorithms to effectively learn long-term dependencies between data while extracting feature information with a larger receptive field range, reducing the risk of overfitting.In addition, using the ISSA to optimize the parameter selection required for the TCN-BiLSTM model can improve the search speed of the model, overcome the blindness and limitations of the traditional LSTM model in parameter selection, and thus improve prediction accuracy.
To analyze the differences among the models, a comparative analysis of RMSE, MAE, R2, and RT for the four models was undertaken (Model 1 represents LSTM; Model 2 represents GRU; Model 3 represents TCN-BiLSTM; Model 4 represents ISSA-TCN-BiLSTM), the results are shown in Table 4.The ISSA-TCN-BiLSTM model achieved an RMSE of 0.1143 and an MAE of 0.1157 on MRI equipment, and an RMSE of 0.1157 and MAE of 0.0817 on CT equipment.The improved model attained R2 values of 0.95 and 0.96 on different equipment.The proposed method significantly outperforms mainstream forecasting models in evaluation metrics, especially with a runtime of only 3.9864 s and 3.8712 s, indicating high real-time performance.
combination model can merge the advantages of different algorithms to effectively learn long-term dependencies between data while extracting feature information with a larger receptive field range, reducing the risk of overfitting.In addition, using the ISSA to optimize the parameter selection required for the TCN-BiLSTM model can improve the search speed of the model, overcome the blindness and limitations of the traditional LSTM model in parameter selection, and thus improve prediction accuracy.To analyze the differences among the models, a comparative analysis of RMSE, MAE, R2, and RT for the four models was undertaken (Model 1 represents LSTM; Model 2 represents GRU; Model 3 represents TCN-BiLSTM; Model 4 represents ISSA-TCN-BiLSTM), the results are shown in Table 4.The ISSA-TCN-BiLSTM model achieved an RMSE of 0.1143 and an MAE of 0.1157 on MRI equipment, and an RMSE of 0.1157 and MAE of 0.0817 on CT equipment.The improved model attained R2 values of 0.95 and 0.96 on different equipment.The proposed method significantly outperforms mainstream forecasting models in evaluation metrics, especially with a runtime of only 3.9864 s and 3.8712 s, indicating high real-time performance.
The experimental results demonstrate that the model proposed in this article exhibits robust learning capabilities when faced with abrupt changes in operating state samples.Additionally, it achieves superior generalization performance, thereby minimizing operational errors during equipment operating state transitions.Notably, ISSA-TCN-BiLSTM demonstrates excellent predictive performance across various medical equipment types and serves as an effective method for forecasting the active power of medical equipment.Consequently, it can promptly and precisely predict the operating status of medical equipment.The experimental results demonstrate that the model proposed in this article exhibits robust learning capabilities when faced with abrupt changes in operating state samples.Additionally, it achieves superior generalization performance, thereby minimizing operational errors during equipment operating state transitions.Notably, ISSA-TCN-BiLSTM demonstrates excellent predictive performance across various medical equipment types and serves as an effective method for forecasting the active power of medical equipment.Consequently, it can promptly and precisely predict the operating status of medical equipment.

Discussion
In this study, a predictive model for active power of medical equipment is proposed based on an improved Sparrow Search Algorithm (ISSA), Temporal Convolutional Network (TCN), and Bi-directional long short-term memory network (BiLSTM).The experimental results demonstrate that this model outperforms GRU, LSTM, and other existing methods in terms of accuracy and error in active power prediction.The model addresses issues related to missing data and poor robustness when the operating state of medical equipment changes, and it can meet the prediction requirements for active power of various medical equipment.
The ISSA-TCN-BiLSTM model extracts and reduces the dimension of active power features by combining TCN and BiLSTM to prevent gradient explosion in active power prediction for medical equipment, which can hinder model convergence.By adjusting the dilation factor D, TCN can achieve a larger input time series receptive field with fewer convolutional layers, reducing calculation time and enhancing prediction accuracy.The BiLSTM model links positive and negative LSTM layers to better capture context information from time series data and improve the model's ability to identify active power features.Additionally, ISSA is utilized to optimize the hyperparameters of the TCN-BiLSTM model, aiming to enhance the hyperparameters for active power prediction tasks across various medical equipment.
The experimental results demonstrate that the model proposed in this article exhibits robust learning capabilities when faced with abrupt changes in operating state samples.Additionally, it achieves superior generalization performance, thereby minimizing operational errors during equipment operating state transitions.Notably, ISSA-TCN-BiLSTM demonstrates excellent predictive performance across various medical equipment types and serves as an effective method for forecasting the active power of medical equipment.Consequently, it can promptly and precisely predict the operating status of medical equipment.The ISSA-TCN-BiLSTM model strikes a good balance between the stability and accuracy of prediction, making it an effective method for predicting the active power of medical equipment.
Although ISSA-TCN-BiLSTM shows strong performance, it also has certain limitations.The diversity and size of the dataset may affect the model's generalization ability.Due to the lack of current medical equipment active power datasets, the model proposed in this paper has not been validated on external datasets.Therefore, in future work, we will incorporate transfer learning and multimodal technology in the application of other types of medical equipment for experimentation, exploration, and advancement.Simultaneously, we will conduct more comparisons between different models to facilitate a more in-depth analysis and discussion of the model's generalization performance.

Summary
In order to accurately monitor the operational status of medical equipment, this paper proposes a medical equipment active power prediction model based on ISSA-TCN-BiLSTM.The main conclusions are as follows: (1) Through Pearson correlation analysis, abnormal data processing, and normalization, the original sample set is preprocessed to extract useful features related to active power changes.This process enhances the correlation between input features, addressing the limitations of low input quality and high information redundancy in traditional prediction models.(2) The use of TCN to extract and reduce key input features has improved the accuracy of BiLSTM.In addition, by improving the global search and local search capabilities of the SSA, the parameter selection of TCN-BiLSTM is optimized, aiming to avoid blind parameter settings and getting stuck in local optimal solutions.Verified by actual measurement data, the model proposed in this article can effectively improve prediction accuracy, thereby achieving reliable prediction of active power for different medical equipment.
(3) When using different evaluation criteria to measure model performance, some potential influencing factors are sometimes ignored.Therefore, in the final evaluation step of the model prediction performance, we selected the three evaluation criteria of RMSE, MAE, and R2 to evaluate the accuracy of the model.At the same time, RT was used to evaluate the complexity of the model by the speed of the model response.
Through comparative experiments, when the ISSA-TCN-BiLSTM model proposed in this paper is applied to MRI and CT equipment, its RMSE, MAE, and R2 values show that its prediction performance is better than LSTM, GRU, and TCN-BiLSTM models, while the RT index is slightly inferior to LSTM and GRU, but in a large data set, this impact is small.Combining these four evaluation criteria helps to obtain a more objective and effective evaluation of the model proposed in this paper.
The ISSA-TCN-BiLSTM model proposed in this paper improves the accuracy of time series prediction of medical equipment monitoring points and provides a theoretical basis for decision-making on the preventive maintenance cycle of medical equipment and reducing maintenance costs.In future work, we will conduct an in-depth study of largescale medical equipment active power prediction tasks.In addition, we will combine machine learning, deep learning, multi-modality, and transfer learning and build different medical equipment active power datasets to solve various medical equipment power prediction tasks.

Figure 3 .
Figure 3.The structure of the BiLSTM network.

Figure 3 .
Figure 3.The structure of the BiLSTM network.

Figure 6 .
Figure 6.Comparisons of prediction errors among different models.
In the equation, i t , f t and o t respectively denote the input gate, forget gate and output gate, and x t represents the input at time t.The sigmoid activation function is represented by σ(•), and tanh(•) represents the hyperbolic tangent activation function.The weight coefficients of the output gate are denoted by w ox , w oh , w oc , and C t and

Table 2 .
Deep learning server configuration.
8 and torch 1.13.1 Some of the training hyperparameters of the ISSA-TCN-BiLSTM model are as follows: The convolution kernel size is 3; the expansion coefficient D is 1,2,4; the dropout probability in TCN is 0.1; the number of BiLSTM layers is 2; and the total iteration number

Table 2 .
Deep learning server configuration.

Table 2 .
Deep learning server configuration.

Table 2 .
Deep learning server configuration.

Table 4 .
Analysis of model forecast on 3 January 2024.

Table 4 .
Analysis of model forecast on 3 January 2024.