ABCNet: A comprehensive highway visibility prediction model based on attention, Bi-LSTM and CNN

Meteorological disasters along highways significantly reduce road traffic efficiency. Low visibility caused by heavy fog is a severe meteorological disaster that greatly increases highway traffic accidents. Accurately predicting highway visibility and taking timely response measures can reduce the impact of meteorological disasters and improve traffic safety. We proposed an Attention-based BiLSTM-CNN (ABCNet) model, which synergized attention mechanisms with BiLSTM and CNN technologies to forecast atmospheric visibility more accurately. First, the Bi-LSTM module processed information both forward and backward, capturing intricate temporal dependencies in the model. Second, the multi-head attention mechanism following the Bi-LSTM distilled and prioritized salient features from multiple aspects of the sequence data. Third, the CNN module recognized local spatial features, and a singular attention mechanism refined the feature map after the CNN module, further enhancing the model’s accuracy and predictive capability. Experiments showed that the model was accurate, effective, and significantly advanced compared to conventional models. It could fully extract the spatiotemporal characteristics of meteorological elements. The model was integrated into practical systems with positive results. Additionally, this study provides a self-collected meteorological dataset for highways in high-altitude mountainous areas.


Introduction
The critical importance of road safety has become increasingly pronounced with the ongoing advancements in transportation vehicles and infrastructure development [1][2][3]. Among the numerous factors compromising road safety, meteorological disasters such as heavy fog, haze, rain, snow, etc. along highways stand out as a significant impediment to traffic efficiency [4,5]. Yunnan Province in China, characterized by its mountainous terrain and frequent low visibility conditions, epitomizes the challenges posed by meteorological disasters on highways. The province's extensive road network, particularly in higher altitude areas, is vulnerable to low visibility conditions, underscoring the need for accurate visibility predictions to facilitate effective traffic management and safety measures.
Figure 1 shows actual scene photos from the Yunnan Province Plateau Mountain Area Traffic Meteorological Database collected during this study, demonstrating drastic visibility changes (within 36 minutes) at the same spatial location. In this study, we focused on improving highway visibility prediction through the analysis of meteorological and related data using advanced computational models and algorithms.
Traditional visibility prediction methods, such as statistical and regression analysis, suffer from limitations including low accuracy, lengthy computation times, and complexity, hindering their practical applicability for ensuring swift traffic flow on highways.
In response to these limitations, this research harnesses the potential of deep learning, which has shown remarkable success in fields like computer vision and natural language processing, to advance the state of road visibility prediction. By leveraging neural networks' adaptability and non-linear mapping capabilities, deep learning-based predictions offer a promising approach to achieving accurate and reliable visibility forecasts. However, the application of deep learning in this domain is not without challenges. Current techniques predominantly rely on image data samples, which demand high-quality data collection standards and are susceptible to various non-meteorological factors that can impair data integrity.
We propose ABCNet, an attention-based BiLSTM-CNN framework for highway visibility prediction using multi-dimensional non-image data. The contributions are as follows:
• A meteorological dataset is constructed from specialized equipment on highways in the plateau mountain areas of Yunnan Province, China, covering various weather conditions and visibility levels.
• A hybrid model is designed that combines the advantages of BiLSTM and CNN to extract temporal and spatial features from non-image data, utilizing an attention mechanism to enhance the representation of features most relevant for visibility prediction.
• Extensive experiments are conducted to evaluate the model's performance, comparing it with several state-of-the-art deep learning baseline methods. The results indicate superior accuracy, robustness, and generalization of the model.
• The practical value of the model for highway management is demonstrated by providing visibility predictions and alerts for drivers and operators, significantly reducing the probability of traffic accidents.
The contribution is not merely the amalgamation of techniques but their novel application and the creation of a unique dataset. Its uniqueness lies in the synergistic integration of these methods with the newly developed plateau mountain area traffic meteorological dataset. This comprehensive approach collectively enhances the model's predictive accuracy and applicability across varied geographic regions.
The rest of this paper is organized as follows. Section 2 reviews current advancements in highway visibility prediction, encapsulating the related work. Section 3 presents both utilized and self-developed datasets, underscoring the contribution to the field. Section 4 delves into the methodology, detailing the ABCNet model proposed in this paper. Section 5 is dedicated to experiments and analysis, evaluating and comparing the model's performance and analyzing the results. Finally, Section 6 concludes the paper, summarizing the major contributions and looking forward to future research directions.

Related work
Accurately predicting low visibility scenarios is crucial for operational safety at airports and along coastlines. Visibility prediction currently relies heavily on numerical forecasting methods similar to weather prediction. Zhang et al. [6] introduced a multimodal fusion technique to construct a weather visibility prediction system. Cornejo-Bueno et al. [7] investigated the persistence and prediction of low-visibility events at Villanubla Airport in Spain, particularly during winter months. They studied the Runway Visual Range (RVR) time series and evaluated short-term visibility persistence using Markov chain analysis.
Kamangir et al. [8] proposed a deep learning framework with an attention mechanism for visibility prediction, achieving state-of-the-art accuracy (68.9%) on runway visual range prediction within a custom dataset collected at airport observation stations. Liu et al. [9] developed both a polynomial regression model and a deep neural network (DNN) model for visibility prediction. Yu et al. [10] focused on applying a machine-learning-based fusion model to visibility forecasting in Shanghai, China. They introduced a boosting-based fusion model (BFM) and compared it to other prediction models, including LightGBM based on multisource data (LGBM) and RAEMS.
Zang et al. [11] developed a recurrent neural network (RNN) prediction model named SwiftRNN, which outperformed ConvLSTM and PredRNN models regarding skill scores for visibility prediction. Ortega et al. [12] reviewed a variety of deep learning models that have been used for visibility prediction and compared their performance on a variety of datasets. Peláez-Rodríguez et al. [13] proposed a novel ensemble model for atmospheric visibility forecasting, based on the combination of machine learning models and numerical weather prediction (NWP) data.
Kim et al. [14] aimed to enhance visibility forecasts by establishing an automatic visibility observation network composed of 291 forward-scattering sensors in South Korea. Data assimilation improved prediction skills, particularly within a nine-hour forecast window and for extremely low-visibility events. Qian et al. [15] investigated the application of anomaly-based weather analysis to predict low visibility associated with coastal fog at Ningbo Zhoushan port in East China. Fernández-González et al. [16] studied forecasting of poor visibility episodes near Tenerife Norte airport, testing various methods for estimating visibility based on mesoscale model outputs and presenting an application for real-time monitoring of weather conditions to assess poor-visibility risk. Pahlavan et al. [17] conducted a numerical prediction study of radiation and CBL fog events over Iran using the WRF model with different model configurations, with visibility prediction as a key focus.
Egli et al. [18] combined data and video quantifying images to analyze large fog evolution trends quantitatively. Kim et al. [19] predicted visibility in South Korea (VISRF) using a random forest (RF) model based on ground observation data from the Automated Synoptic Observing System (ASOS) and air pollutant data from the European Centre for Medium-Range Weather Forecasts (ECMWF) Copernicus Atmosphere Monitoring Service (CAMS) model. Their method exhibited a smaller bias below 2 km compared to other visibility parameterization schemes.
Moreover, Wen et al. [20] compared the performance of five common machine learning methods under various training parameter schemes, including XGBoost, LightGBM, Random Forest (RF), Support Vector Machine (SVM) and Multiple Linear Regression (MLR) using long-term measured data. Noteworthy research contributions are also made by Zhen et al. [21] and Shi et al. [22].
In summary, the related research works reveal the following shortcomings:
• Limited Geographic Applicability: Many existing studies rely on datasets from airports and ports, which primarily reflect conditions in plain areas. This specialization leaves a significant gap in research for highway visibility prediction across diverse topographies, such as mountainous regions, where environmental conditions drastically differ.
• Inadequate Evaluation Across Conditions: Many studies do not comprehensively evaluate their methods under a variety of conditions, leading to a limited understanding of their performance and robustness. The absence of rigorous, diverse-condition testing restricts the proven applicability of these methods, highlighting the importance of extensive validation efforts to ensure reliability and effectiveness in real-world scenarios.
• Limited Adaptability and Generalization: Existing methods often lack adaptability to diverse environmental conditions, making them less effective for wide-ranging geographic and meteorological variations. This shortfall is particularly evident in models optimized for specific climates or regions, which may not perform well under different conditions, underscoring the need for more versatile and generalizable approaches.
• Despite the availability of datasets from structured environments like airports and ports, a critical gap exists in validating prediction models within operational highway systems. It is essential to ascertain real-world efficacy, as it involves ground truth data from diverse and dynamic traffic conditions, particularly in complex terrains.
In this paper, a deep learning model combined with multiple dimensions of meteorological data is leveraged to create a comprehensive approach for highway visibility prediction. Specifically, an Attention-based BiLSTM-CNN network (ABCNet) is proposed, a cutting-edge prediction model designed to provide accurate and timely short-term visibility forecasts on highways. The proposed method has the following advantages:
• The method significantly enhances geographic applicability by leveraging a uniquely developed plateau mountain area traffic meteorological dataset. This dataset empowers the model to effectively predict visibility in diverse topographies, particularly in mountainous regions where conditions vary significantly.
• The model exhibits strong universality. The model has also been compared and tested on the self-built and public datasets. The results show that the model is universally applicable and suitable for various visibility prediction scenarios.
• The prediction accuracy has been validated with accurate application data. The model has been integrated into the actual "Highway Traffic Meteorological Intelligent Monitoring and Proactive Control System" for the past four years. Frontline users have validated its accuracy. This is specifically detailed in Section 5.7.

Data collection
The research is supported by the National Engineering Laboratory for Surface Transportation Weather Impacts Prevention. This laboratory is a leading institution in China focusing on transport meteorology, particularly in the field of highway meteorology. Figure 2 illustrates the meteorological data collection device used by the team, which is pivotal for the dataset creation in this study.
The Mazhao Highway in Yunnan Province, China, is a typical mountain highway with complex climate and frequent sudden visibility shifts, making it highly suitable for the visibility research. In accordance with the technical standards for the construction of highway meteorological station networks [23], 17 multi-element traffic meteorological stations were installed on the Mazhao Highway by the end of 2021. Each data record collected by the meteorological equipment includes 15 meteorological elements such as visibility, wind speed, temperature, air pressure, humidity, wind direction, precipitation, road surface temperature, and road conditions. Figures 3 and 4 show the geographical locations of the 17 meteorological stations and examples of the raw meteorological data, respectively.
Based on the data from these 17 meteorological stations, a highway meteorological element dataset named WD-17 [24] was constructed for the visibility prediction task. This dataset contains 8,348,575 entries of multidimensional, high-precision, high-integrity, and high-quality meteorological sample data (one entry per minute) collected from March 2022 to July 2023. This is believed to be the first public minute-level meteorological dataset in the world for highways in high-altitude mountainous areas. It has multi-dimension meteorological data on dramatically changing visibility that is crucial for building high-performance visibility prediction models.

Dataset preprocessing
The meteorological elements collected by the weather stations along the highways include nearly twenty different factors such as temperature, rainfall, visibility, humidity, road surface temperature, road conditions, wind speed, wind direction, air pressure, subgrade temperature, water film thickness, freezing point temperature, ice layer thickness, snow layer thickness, and slipperiness coefficients. Some of these elements may be irrelevant or redundant for visibility prediction, contributing to limitations in model training and predictive performance. In fact, only about five elements might be strongly correlated with visibility prediction. Therefore, the cosine similarity method is used to filter out low-correlation features, reducing data dimensionality, and enhancing the model's predictive performance. The calculation of cosine similarity is shown in Eq (1).
$$\mathrm{Similarity}(A, B) = \frac{A \cdot B}{\|A\| \, \|B\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2} \, \sqrt{\sum_{i=1}^{n} B_i^2}} \tag{1}$$
Here, A and B represent two different meteorological factors. The higher the value of Similarity(A, B), the more relevant the two features are. After removing redundant features with extremely high correlation coefficients (greater than 0.9), six meteorological elements (visibility, wind speed, temperature, humidity, precipitation and road surface temperature) were retained for the research. The dataset format is shown in Table 1.
Table 1. Format of dataset WD-17 after feature dimensionality reduction.
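The redundancy filter described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names, the greedy keep-or-drop order, the zero-vector convention, and the toy columns are hypothetical and are not taken from the WD-17 pipeline. Only the cosine similarity formula and the 0.9 threshold come from the text.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric sequences."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention assumed here for all-zero columns
    return dot / (norm_a * norm_b)

def drop_redundant(columns, threshold=0.9):
    """Greedily keep a column only if it is not too similar to any kept column."""
    kept = []
    for name in columns:
        if all(abs(cosine_similarity(columns[name], columns[k])) <= threshold
               for k in kept):
            kept.append(name)
    return kept

# Toy columns with made-up values (not WD-17 data).
toy = {
    "visibility": [1.0, 0.0, 0.0, 1.0],
    "humidity": [0.0, 1.0, 1.0, 0.0],
    "humidity_copy": [0.0, 2.0, 2.0, 0.0],  # a rescaled duplicate of humidity
}
kept = drop_redundant(toy)
print(kept)
```

In this toy case the rescaled duplicate is dropped because its similarity to the humidity column is 1, while the two remaining columns are orthogonal and both survive.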

Other comparative experimental datasets
In the experiments, to demonstrate the universality of the model in visibility prediction tasks, a public visibility dataset WD-Vigo [25] from the Vigo Airport weather station in Spain was also used, spanning 2008 to 2020. This dataset records meteorological elements every 30 minutes, comprising a total of 219,439 data entries, including visibility, temperature, humidity, wind direction, wind speed and air pressure. The format of the dataset is shown in Table 2.

Methods
Drawing upon the advancements in sequence data processing across domains such as speech recognition and natural language processing, this study introduces the ABCNet model, a novel framework designed for the prediction of highway visibility. The innovation of ABCNet stems from its integrative use of Bi-directional Long Short-Term Memory (Bi-LSTM), Convolutional Neural Network (CNN), and attention mechanisms, offering a comprehensive approach to capturing the spatiotemporal dynamics of meteorological factors impacting visibility. Specifically, the Bi-LSTM module is employed to analyze temporal sequences in both forward and reverse directions, thereby unraveling complex temporal patterns within the data. This temporal analysis is enhanced by a multi-head attention mechanism, situated subsequent to the Bi-LSTM layer, which serves to isolate and amplify critical features from the temporal data, ensuring a focused analysis on the most influential elements. Moreover, the incorporation of a CNN module aids in the extraction of spatial features. This integration, coupled with a singular attention mechanism, optimizes the refinement of the feature map, significantly bolstering the model's predictive accuracy and capability. The structural design of ABCNet, including its sequential integration of these computational components, is delineated in Figure 5.
Integrating Bi-LSTM with multi-head attention mechanisms, the model enhances feature capture and representation precision. Following Bi-LSTM, the multi-head attention offers:
• Enhanced Representation: It enables simultaneous focus on diverse aspects of data, ensuring a more comprehensive information capture.
• Robustness: Parallel processing improves the model's resilience against scene variations or noise.
• Adaptive Learning: Automatically adjusts feature weights, aiding in the precise identification of important sequence parts.
• Long-Distance Dependency Recognition: This focuses on various sequence positions, effectively recognizing long-range dependencies.
Furthermore, combining Bi-LSTM and CNN exploits each method's strengths, offering:
• Spatiotemporal Feature Capture: Bi-LSTM manages temporal dependencies; CNN excels in spatial pattern recognition. Together, they reveal complex spatiotemporal relationships.
• Multilevel Feature Extraction: This combination enhances data understanding and representation through a layered analysis approach.
• Multiscale Optimization: Improves handling of information across different scales, boosting adaptability.
These components and their synergistic benefits underpin the ABCNet model's framework, detailed further in subsequent Sections.

Problem formulation
In this research, time series data of meteorological factors pertinent to highway conditions, collected via dedicated weather monitoring devices, are analyzed. The dataset is organized into segments based on a predefined temporal window of length $n$, formulated as $S = \{s_1, s_2, \dots, s_n\}$, where each $s_i$ ($i = 1, \dots, n$) signifies a vector of meteorological variables (e.g., temperature, precipitation, wind speed, humidity, and road surface conditions) corresponding to sequential time points. The Bi-LSTM model processes these segments through an input matrix $X \in \mathbb{R}^{N \times D}$, where $N$ and $D$ respectively denote the temporal window's size and the dimensionality of each vector within that window. The Bi-LSTM's output, represented as $h_i$, encapsulates a complex feature representation of the time series data, further refined by a multi-head attention mechanism to accentuate significant features. Subsequently, the CNN module, enhanced by a singular attention mechanism, integrates these features to deduce the final model output. The primary objective is forecasting future road visibility, utilizing a fixed-length series of meteorological data $S$.
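The windowing step in this formulation can be illustrated with a short plain-Python sketch. The function name, the choice of the first feature as the visibility target, and the toy values are assumptions made for illustration; the paper does not specify how its segments are built in code.

```python
def make_windows(series, window, horizon):
    """Split a sequence of records into (input window, target) pairs.

    series  : list of feature vectors, one per time step
    window  : number of past steps fed to the model (n in the text)
    horizon : number of future steps to predict
    The target is the first feature of each future record,
    assumed here to be visibility.
    """
    pairs = []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        x = series[start:start + window]
        future = series[start + window:start + window + horizon]
        y = [row[0] for row in future]
        pairs.append((x, y))
    return pairs

# Toy series: 6 time steps, 2 features each (visibility first); made-up values.
toy = [[10.0, 0.5], [9.0, 0.6], [8.0, 0.7], [7.0, 0.8], [6.0, 0.9], [5.0, 1.0]]
pairs = make_windows(toy, window=3, horizon=2)
for x, y in pairs:
    print(x, "->", y)
```

Each input window corresponds to one matrix X with N = window rows and D feature columns; a series shorter than window + horizon simply yields no pairs.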

Bi-LSTM module
Visibility on highways is forecasted using historical data, based on the rationale that past patterns provide insights into future conditions. This approach naturally implies that short-term forecasts are often more precise than long-term ones due to the expected continuity of recent trends. The analysis of visibility trends benefits significantly from examining time series data both forward and backward. While conventional LSTM models may struggle to capture the nuanced features of these trends, Bi-LSTM stands out by assimilating bidirectional time series data, thus enriching the model's informational base. By incorporating two LSTM layers, Bi-LSTM overcomes the limitations of traditional models, which process the input sequence in one direction only. It effectively utilizes both preceding and succeeding data points within each input window, facilitating a comprehensive analysis of temporal patterns. Bi-LSTM includes forward and backward models for processing meteorological data, offering a holistic approach to data analysis. Figure 6 illustrates a standard Bi-LSTM model configuration, showcasing its ability to leverage bidirectional data for enhanced predictive accuracy.
Bi-LSTM's forward and backward outputs are connected using Eq (2) for further processing.
$$h_t = [\overrightarrow{h_t} \,\|\, \overleftarrow{h_t}] \in \mathbb{R}^{2L} \tag{2}$$
where $\overrightarrow{h_t}$ and $\overleftarrow{h_t}$ represent the outputs of the forward and backward LSTM, respectively, $\|$ denotes the concatenation operation, and $L$ is the size of each LSTM.
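The Bi-LSTM layer is used to obtain the representation of the meteorological element time series. The embedding vector sequence $S = \{s_1, s_2, \dots, s_n\}$ of the series is first obtained; the Bi-LSTM layer is then applied to $S$ to obtain the forward vector sequence as in Eq (3) and the backward vector sequence as in Eq (4), and the two sequences are concatenated to get Eq (5).
$$\overrightarrow{l_i} = \overrightarrow{\mathrm{LSTM}}(s_i), \quad i = 1, \dots, n \tag{3}$$
$$\overleftarrow{l_i} = \overleftarrow{\mathrm{LSTM}}(s_i), \quad i = n, \dots, 1 \tag{4}$$
$$l_i = [\overrightarrow{l_i} \,\|\, \overleftarrow{l_i}], \quad L = \{l_1, l_2, \dots, l_n\} \tag{5}$$
where $L$ is the fundamental features of $S$ and $T$, and $l_i$ represents the feature of each sequence in the sequence set. As the output vector of Bi-LSTM, $l_i$ will be input into the CNN module through a multi-head attention mechanism.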
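As a rough illustration of the forward pass, backward pass, and per-step concatenation, the sketch below uses a simple tanh recurrent cell as a stand-in for the LSTM cell, chosen only to keep the example short. It is not the authors' implementation; the function names and weights are hypothetical.

```python
import math

def rnn_pass(sequence, w_in, w_rec):
    """Run a simple tanh recurrent cell over the sequence (LSTM stand-in).

    w_in  : hidden_size x input_size weights
    w_rec : hidden_size x hidden_size weights
    Returns one hidden vector per time step.
    """
    hidden = [0.0] * len(w_rec)
    outputs = []
    for x in sequence:
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[j], x))
                + sum(w * hi for w, hi in zip(w_rec[j], hidden))
            )
            for j in range(len(w_rec))
        ]
        outputs.append(hidden)
    return outputs

def bidirectional(sequence, fwd_weights, bwd_weights):
    """Concatenate forward and backward hidden states at each time step."""
    forward = rnn_pass(sequence, *fwd_weights)
    # Process the reversed sequence, then re-align to the original order.
    backward = rnn_pass(sequence[::-1], *bwd_weights)[::-1]
    return [f + b for f, b in zip(forward, backward)]

# Toy input: 3 time steps, 2 features; hidden size 2 per direction.
seq = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
weights = ([[0.5, -0.5], [0.25, 0.25]], [[0.1, 0.0], [0.0, 0.1]])
states = bidirectional(seq, weights, weights)
print(len(states), len(states[0]))
```

Each output vector has twice the per-direction hidden size, matching the 2L dimension of the concatenated state.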

Attention and multi-head attention
The attention mechanism, as outlined in reference [26], assigns a score to each dimension of the input data, subsequently weighting the features based on these scores to accentuate the critical features. This process enables the mechanism to significantly influence downstream models or modules by prioritizing information that is deemed most relevant for the task at hand. The operational details of the attention mechanism are encapsulated by Eq (6).
$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{Q K^{T}}{\sqrt{d_k}}\right) V \tag{6}$$
where $Q$ represents the query vector, $K$ is the key vector, and $V$ is the value vector. $d_k$ represents the dimension of the key vectors, and the superscript $T$ denotes the transpose of $K$. $Q$, $K$, and $V$ are obtained through weight matrices that are initially randomized and then optimized by gradient descent during training.
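A minimal plain-Python sketch of scaled dot-product attention follows. It assumes the standard form softmax(QK^T / sqrt(d_k))V; the helper names and toy numbers are illustrative only.

```python
import math

def softmax(row):
    """Numerically stable softmax of a list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors."""
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value vectors, one output feature at a time.
        outputs.append([
            sum(wi * v[j] for wi, v in zip(w, values))
            for j in range(len(values[0]))
        ])
    return outputs, weights

# Toy example with made-up numbers: 2 queries, 3 key/value pairs, d_k = 2.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
out, attn = attention(Q, K, V)
print(attn[0])
```

The first query matches the first and third keys equally well, so those two positions receive identical weights that exceed the weight of the second key.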
The attention mechanism simplifies to mapping input elements into vectors within a matrix $X$, followed by calculating attention weights using vectors $Q$, $K$ and $V$ for a weighted summation output. Enhanced by a multi-head mechanism from the Transformer model [27], which operates in parallel subspaces for richer feature processing, this approach significantly boosts feature capture capabilities. The ABCNet model incorporates multi-head attention to better understand complex data dependencies. The output of the Bi-LSTM is directed into a multi-head attention layer, enhancing performance by focusing on varied feature levels. The structure of multi-head attention is detailed in Figure 7. The multi-head attention layer is shown in Eqs (7) and (8).
$$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \dots, \mathrm{head}_h)\, W^{O} \tag{7}$$
$$\mathrm{head}_l = \mathrm{Attention}(Q W_l^{Q}, K W_l^{K}, V W_l^{V}) \tag{8}$$
where $Q$, $K$, $V$ are transformed from the output of Bi-LSTM, and $\mathrm{head}_l$ represents the attention score calculated by the $l$-th head in multi-head attention. Concat represents the concatenation operation. $W_l^{Q}$, $W_l^{K}$, $W_l^{V}$ and $W^{O}$ are several different weight matrices, which are initially randomized and then automatically adjusted throughout the training process.
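The multi-head computation can be sketched in plain Python as below, assuming self-attention over a single input sequence (as when Q, K and V are all derived from the Bi-LSTM output). The function names, head widths, and weights are hypothetical choices for illustration.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    """Scaled dot-product attention, softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v)

def multi_head_attention(x, heads, w_o):
    """heads is a list of (W_Q, W_K, W_V) triples; w_o is the output projection."""
    head_outputs = [
        attention(matmul(x, w_q), matmul(x, w_k), matmul(x, w_v))
        for w_q, w_k, w_v in heads
    ]
    # Concatenate the heads feature-wise at every time step, then project.
    concat = [sum((h[t] for h in head_outputs), []) for t in range(len(x))]
    return matmul(concat, w_o)

# Toy input: 3 time steps, model width 2; two heads of width 1 (made-up weights).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
heads = [
    ([[1.0], [0.0]], [[1.0], [0.0]], [[1.0], [0.0]]),  # head 1 uses feature 0
    ([[0.0], [1.0]], [[0.0], [1.0]], [[0.0], [1.0]]),  # head 2 uses feature 1
]
w_o = [[1.0, 0.0], [0.0, 1.0]]  # identity output projection
y = multi_head_attention(x, heads, w_o)
print(y)
```

Because each head projects onto a different input feature, the two heads attend to the sequence independently before their outputs are concatenated and projected.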

CNN module
While CNNs are commonly associated with image data processing, ABCNet distinctively employs one-dimensional convolution (Conv1D) for the analysis of time series data. This method demonstrates superior precision compared to traditional linear regression models. Conv1D is particularly effective for processing various forms of sequence data, including text, audio, and time series, by taking a two-dimensional tensor as input. This tensor's first dimension accounts for time steps, while the second dimension captures the feature dimensions associated with each time step. In a manner analogous to Conv2D within CNN architectures, Conv1D operates according to the equation:
$$y(i) = b + \sum_{k} W(k)\, X(i + k)$$
where $y(i)$ represents the i-th element of the output sequence, $b$ signifies the bias term, $W(k)$ is the weight term, and $X(i+k)$ denotes the input sequence, with $k$ indicating the position within the convolution window. The convolution window is slid over the input sequence to generate the output sequence, adapting to various applications such as analyzing meteorological data series. This sliding process overlays the convolution window onto the input sequence, computing the values within the window to produce the sequence's output.
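The sliding-window computation can be shown with a single-channel sketch in plain Python. It covers only the "valid" (no padding) case with one feature per time step; the multi-feature case used in practice sums the same expression over feature channels. The function name and toy values are illustrative.

```python
def conv1d(x, w, b=0.0):
    """Valid (no padding) 1D convolution: y(i) = b + sum_k W(k) * X(i + k)."""
    k = len(w)
    return [
        b + sum(w[j] * x[i + j] for j in range(k))
        for i in range(len(x) - k + 1)
    ]

# Toy series and a 3-tap weighted kernel (made-up values).
series = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [0.5, 0.25, 0.25]
y = conv1d(series, kernel, b=1.0)
print(y)  # [2.75, 3.75, 4.75]
```

With an input of length 5 and a window of length 3, the window fits at three positions, so the output has three elements.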

Output
The ABCNet model produces a forecasted time series output succinctly expressed as $\hat{Y} = \{\hat{y}_{t+1}, \hat{y}_{t+2}, \dots, \hat{y}_{t+m}\}$. Here, $\hat{y}_{t+k}$ represents the forecasted value for a future time point $t + k$, with $m$ denoting the number of steps ahead from the current time $t$. This format clearly lays out the model's predictions in a sequential manner, spanning from the near future up to the specified forecasting horizon. Each series element marks a distinct future moment, providing a detailed projection of anticipated conditions throughout the period in question.
In its final stage, the model synthesizes the deep learning processes' collective outcomes, integrating temporal patterns identified by the Bi-LSTM module with spatial characteristics discerned by the CNN module, all finely tuned via an attention mechanism to prioritize significance and precision. This culmination showcases the model's adeptness at merging sophisticated feature extraction with sequential data analysis, highlighting its comprehensive approach to forecasting.

Dataset division
In the experiments, the WD-17 dataset was divided into a training set and a test set in an 8:2 ratio, with the training set comprising 6,678,860 entries and the test set 1,669,715 entries. Similarly, the comparative experimental dataset WD-Vigo was divided into a training set (175,551 entries) and a test set (43,888 entries) in an 8:2 ratio.
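The reported split sizes can be reproduced with a small sketch. The text does not say whether the split is chronological or shuffled; the second helper assumes a chronological split, which is a common choice for time series but is an assumption here. All names are hypothetical.

```python
def split_sizes(n_total, train_parts=8, total_parts=10):
    """Return (train, test) sizes for an 8:2 style split."""
    # Integer arithmetic avoids floating-point rounding on large counts.
    n_train = n_total * train_parts // total_parts
    return n_train, n_total - n_train

def chronological_split(records, train_parts=8, total_parts=10):
    """Split without shuffling so the test set follows the training set in time."""
    n_train, _ = split_sizes(len(records), train_parts, total_parts)
    return records[:n_train], records[n_train:]

wd17 = split_sizes(8_348_575)
vigo = split_sizes(219_439)
print(wd17)  # (6678860, 1669715)
print(vigo)  # (175551, 43888)
```

Both results match the entry counts stated for WD-17 and WD-Vigo.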

Experimental environment configuration and model parameter settings
The experimental environment operates on an Ubuntu 16.04 LTS system. The CPU used is an Intel Core i7-13700H, complemented by 128 GB of memory and a 4-TB hard disk. The GPU is an NVIDIA GeForce RTX 4060 Ti. The programming language employed is Python, with TensorFlow 2.12.0 serving as the machine learning software development library.
The relevant parameters of the ABCNet model utilized in this article are presented in Table 3.

Evaluation metric
In this study, common evaluation metrics for time series forecasting tasks are used: Mean Squared Error (MSE), Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE). The definitions of these metrics are as follows:
• MSE: The average of the squared errors between predicted values and actual values. It measures the gap between predictions and reality. A smaller MSE indicates a better model.
• MAE: Measures the average absolute error between predicted values and actual values. It is a non-negative value; a smaller MAE indicates a better model.
• MAPE: A normalized version of MAE, this metric is sensitive to relative errors and does not change with the global scaling of the target variable, making it suitable for problems with large dimensional differences in the target variable. A smaller MAPE indicates a better model.

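$$\mathrm{MSE}(y, \hat{y}) = \frac{1}{n_{\text{samples}}} \sum_{i=1}^{n_{\text{samples}}} (y_i - \hat{y}_i)^2$$
$$\mathrm{MAE}(y, \hat{y}) = \frac{1}{n_{\text{samples}}} \sum_{i=1}^{n_{\text{samples}}} |y_i - \hat{y}_i|$$
$$\mathrm{MAPE}(y, \hat{y}) = \frac{1}{n_{\text{samples}}} \sum_{i=1}^{n_{\text{samples}}} \frac{|y_i - \hat{y}_i|}{|y_i|}$$
where $n_{\text{samples}}$ represents the number of samples, $y_i$ is the actual value, and $\hat{y}_i$ is the predicted value.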
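The three metrics can be written directly from their definitions. In this sketch MAPE is returned as a fraction rather than a percentage, and the small `eps` guard against zero targets is an assumption added for numerical safety, not something stated in the text.

```python
def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred, eps=1e-12):
    """Mean absolute percentage error as a fraction; eps guards zero targets."""
    return sum(
        abs(t - p) / max(eps, abs(t)) for t, p in zip(y_true, y_pred)
    ) / len(y_true)

# Made-up values for illustration.
y_true = [2.0, 4.0, 8.0]
y_pred = [1.0, 5.0, 8.0]
print(mse(y_true, y_pred), mae(y_true, y_pred), mape(y_true, y_pred))
```

Note that the first two samples have the same absolute error but different relative errors (0.5 and 0.25), which is the sensitivity to relative error that distinguishes MAPE from MAE.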

Performance comparison with different window sizes and time steps
For time series forecasting tasks, the size of the time window and the length of the forecasting time steps are important parameters. The size of the time window represents the length of historical data used, and the forecasting time step determines the output length of the model. To explore the performance of the ABCNet model in short-term road visibility forecasting with various combinations of time window sizes and forecasting time steps, the time window sizes were set to 30 minutes, 1, 2, 3, 6 and 10 hours. The forecasting time steps were set to 15 and 30 minutes, 1 and 2 hours. The experiments on the WD-17 dataset evaluated the model's performance, using MSE and MAE metrics, across these varying configurations. The results are shown in Tables 4 and 5.
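The experimental grid can be enumerated as below. Converting durations to step counts assumes the one-entry-per-minute sampling of WD-17 and assumes the forecast horizon is expressed at the same resolution; the variable names are hypothetical.

```python
from itertools import product

SAMPLES_PER_MINUTE = 1  # WD-17 stores one entry per minute

window_minutes = [30, 60, 120, 180, 360, 600]  # 30 min, 1, 2, 3, 6 and 10 h
horizon_minutes = [15, 30, 60, 120]            # 15 and 30 min, 1 and 2 h

configs = [
    {
        "window_steps": w * SAMPLES_PER_MINUTE,
        "horizon_steps": h * SAMPLES_PER_MINUTE,
    }
    for w, h in product(window_minutes, horizon_minutes)
]
print(len(configs))  # 24
```

Under these assumptions, a 3-hour window with a 30-minute horizon corresponds to 180 input steps and 30 output steps.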
Time series forecasting is typically performed in a single step, that is, by predicting the observation at the next time step. The tables show that for multi-step prediction methods, both the size of the window and the length of the forecast time step affect the model's performance. With a fixed window size, the model's prediction error increases with the lengthening of the forecasting time step. Based on the experimental results and considering the practical application scenarios in highway visibility forecasting (where very short forecasting time steps, such as 15 minutes, are not significantly useful for highway management), a time window size of 3 hours and a time step length of 30 minutes were chosen as the optimal parameters for ABCNet. This parameter combination provides the best results in application scenarios.
The loss curve reflects the changes in the loss function during the model training process. The loss function evaluates the error in the model's predictions, and the model parameters are adjusted based on changes in the loss to make the predictions more accurate. With a window size of 3 hours, the loss curves of the model on the WD-17 dataset for each forecast time step length on both the training and test sets were recorded, as shown in Figure 8. The figure shows that the loss curve steadily decreases, and the error in the model's predictive results continually diminishes until it stabilizes, indicating that the model is effectively capturing the patterns in the training data.

Ablation experiment
To validate the rationality and superiority of the proposed model architecture and to better demonstrate the contribution of different modules in the ABCNet architecture, an ablation experiment on the WD-17 dataset was conducted. The CNN and Attention modules from ABCNet were removed to compare changes in model performance. Based on the experimental results in Section 5.3, the time window was set to 3 hours and the forecast time step to 30 minutes. Table 6 shows the performance of the ablation experiments for each module of the model. The ablation study results indicate that each additional component improves the model's performance, with the combined model (Bi-LSTM+CNN+Attention) achieving the best performance across all metrics, suggesting that the integration of CNN and Attention mechanisms with the Bi-LSTM model significantly enhances the model's prediction accuracy. The lowest MSE (0.0032), MAE (0.031) and MAPE (0.0422) scores for the combined model indicate higher precision and reliability of the results compared to the other variants. This suggests that the additional components contribute positively to the model's ability to capture and utilize relevant patterns in the data.
In conclusion, each component added to the model likely helps capture different types of patterns within the data, and their combination provides a more complete and nuanced understanding of the input features. This is reflected in the lower error rates across all three metrics, indicating a model that generalizes better to new data and provides more accurate predictions.

Performance comparison with competitive models
To validate the performance of the proposed ABCNet model in highway visibility prediction tasks, experiments were conducted on the WD-17 and WD-Vigo datasets, comparing ABCNet with several of the most competitive time series forecasting methods. Based on the results from Section 5.3, the time window was set to 3 hours and the time step to 30 minutes for both datasets. The competing methods include the traditional ARIMA model for time series forecasting, the decision tree ensemble algorithm XGBoost from machine learning, deep learning models such as LSTM and its variants, and the latest time series forecasting model, Informer. The dataset for each competing model was carefully prepared, with preprocessing steps tailored to its specific requirements. For instance, for the ARIMA model, a differencing step was performed to stabilize the mean of the time series. Each model, including ARIMA, XGBoost, the LSTM variants and Informer, underwent a rigorous configuration process in which hyperparameters were tuned by grid search to identify the optimal settings. Finally, the evaluation metrics MSE, MAE and MAPE were calculated to ensure a comprehensive and fair comparison of performance on both datasets.
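For reference, the three evaluation metrics used in the comparison can be computed as in the following minimal sketch. It assumes MAPE is reported as a fraction rather than a percentage (consistent with values such as 0.0422 in the ablation results) and that no true value is zero; the function names are illustrative and not taken from the authors' code.

```python
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, returned as a fraction.

    ASSUMPTION: every true value is non-zero; otherwise the ratio is undefined.
    """
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when a true value is zero")
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    actual = [2.0, 4.0]       # toy values, not data from the paper
    predicted = [1.0, 5.0]
    print(mse(actual, predicted), mae(actual, predicted), mape(actual, predicted))
```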
The following is a brief description of these models:
• ARIMA [28]: An autoregressive integrated moving average model, which uses differencing to forecast non-stationary time series.
• XGBoost [29]: An efficient gradient boosting decision tree algorithm, combining multiple weak learners into a strong learner through forward addition.
• GRU+Attention [33]: A model variant based on LSTM, merging the forget and input gates into an update gate, and emphasizing the importance of each hidden layer output with an attention mechanism.
• Informer [34]: Uses a new attention mechanism to automatically adjust the attention scope according to sequence length, effectively handling long sequences. It also adopts a multi-scale time encoder and decoder structure, considering information across different time scales.
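The differencing step mentioned for the ARIMA baseline can be sketched as follows. This is a generic illustration of first-order (lag-1) differencing and its inverse, not the authors' preprocessing pipeline; the helper names `difference` and `undifference` are hypothetical.

```python
from typing import List, Sequence


def difference(series: Sequence[float], lag: int = 1) -> List[float]:
    """Return series[i] - series[i - lag], which removes a slowly varying mean."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def undifference(first_values: Sequence[float], diffs: Sequence[float], lag: int = 1) -> List[float]:
    """Invert `difference`, given the first `lag` values of the original series."""
    if len(first_values) != lag:
        raise ValueError("need exactly `lag` initial values")
    restored = list(first_values)
    for d in diffs:
        # The element `lag` positions back plus the difference restores the value.
        restored.append(restored[-lag] + d)
    return restored


if __name__ == "__main__":
    toy = [10.0, 12.0, 15.0, 19.0]   # toy values, not data from the paper
    d = difference(toy)
    print(d, undifference(toy[:1], d))
```

Forecasts produced on the differenced scale have to be mapped back with the inverse step before error metrics are computed against the original visibility values.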
The MSE, MAE and MAPE of each model in visibility prediction on the WD-17 and WD-Vigo datasets are shown in Tables 7 and 8.
Model                 MSE      MAE      MAPE
ARIMA [28]            0.0667   0.657    0.2726
XGBoost [29]          0.0371   0.343    0.2383
LSTM [30]             0.0254   0.132    0.1853
LSTM+CNN [32]         0.0223   0.086    0.1027
GRU+Attention [33]    0.0192   0.078    0.0895
Informer [34]         0…
It is clear from the tables that for both datasets, the performance of neural network models in visibility prediction significantly surpasses that of the machine learning model XGBoost and the traditional time series forecasting model ARIMA. While LSTM and its variants perform comparably to one another, ABCNet combines Bi-LSTM, Attention and CNN modules, which can effectively extract multivariate time series information and spatial feature information and adaptively allocate weights to each feature, and it significantly outperforms the other models. On the WD-17 dataset, compared to the latest time series forecasting model architecture (Informer [34]), ABCNet reduced MSE and MAE by 1.13 and 3.3%, respectively. Similarly, on the WD-Vigo dataset, ABCNet's MSE and MAE were reduced by 1.92 and 5.3%, respectively. This demonstrates that ABCNet generalizes well, showing superior performance in visibility prediction tasks in various scenarios, including highways and airports.

Practical application system validation
The integration of the ABCNet model into the Highway Traffic Meteorological Intelligent Monitoring and Proactive Control System for Yunnan Communications Investment & Construction Group CO., LTD [35] represents a substantial stride forward. Deployed for validation in the Mazhao section managed by the Zhaotong office, the model has demonstrated its efficacy in visibility prediction, meeting the anticipated accuracy standards and facilitating local traffic management efforts. This deployment illustrates the model's practical application and its contribution to traffic safety and efficiency. The system in operation is shown in Figure 11.
Given the success of ABCNet in this deployment, further exploration of its applications is warranted. The model's precision in visibility prediction offers a critical tool for enhancing road safety under adverse weather conditions, suggesting its potential utility in automated traffic management systems. By integrating ABCNet with dynamic traffic control algorithms, it is conceivable to develop more responsive systems that adjust traffic signals and speed limits in real time based on visibility data, potentially reducing accident rates and improving traffic flow. Additionally, integrating ABCNet with vehicle-to-infrastructure communication technologies could pave the way for personalized driver alerts regarding visibility and road conditions. This expanded application scope promises not only to advance the state of intelligent transportation systems but also to contribute to the development of smart cities, where transportation efficiency and safety are paramount. Further research in these areas could significantly enhance the relevance of this work and underscore the contributions of ABCNet to the field.

Conclusions
In this study, the ABCNet model, a novel Attention-based Bi-LSTM-CNN approach for accurate highway visibility prediction, was developed. The model harnesses the strengths of Bi-LSTM for bidirectional temporal feature extraction from complex meteorological time series data, complemented by CNN for in-depth spatial feature analysis. Integrating multi-head attention mechanisms further refines the model's capability to assign weights automatically to various features, enhancing its overall predictive accuracy. Another key contribution of this work is the development and use of a self-collected meteorological dataset, which significantly improved the model's predictive performance.
The comprehensive evaluations, using the metrics MSE, MAE and MAPE, demonstrate ABCNet's enhanced performance in visibility prediction over existing models. These results underscore ABCNet's potential as a robust tool for real-world traffic management applications, offering a significant advancement in meteorological forecasting for transportation safety.
The synergy between the novel dataset and the ABCNet methodology is pivotal for advancing visibility prediction. ABCNet's design is rigorously tested and refined by the diverse and complex conditions presented in the high-altitude highway meteorological dataset. It is the dataset's detailed representation of challenging visibility scenarios that enhances the model's predictive capabilities. This combination of advanced methodology and comprehensive data enables an advancement in predicting highway visibility that neither could achieve in isolation.
The ABCNet architecture, employing widely adopted mechanisms such as Bi-LSTM, CNN and attention, provides a solid foundation for accuracy and reliability in various scenarios. However, its capacity for handling exceptionally complex environments or predicting over extended time steps may be limited by the inherent challenges of modeling highly dynamic and unpredictable weather patterns. Future enhancements will explore advanced methodologies to further extend ABCNet's predictive capabilities, ensuring its applicability and effectiveness in more demanding forecasting contexts.
Furthermore, future work will explore incorporating image data from highway cameras into ABCNet to enhance its predictive accuracy by providing direct visual insights into visibility conditions. Additionally, the potential of ABCNet in predictive maintenance of transportation infrastructure under adverse weather conditions remains an exciting avenue for future exploration. By predicting visibility and other related weather parameters, ABCNet could be instrumental in guiding maintenance schedules, thereby improving road safety and longevity. In conclusion, these initiatives underscore ABCNet's potential in the broader domain of intelligent transportation systems.

Use of AI tools declaration
The authors declare that they have not used any Artificial Intelligence (AI) tools in the creation of this article.

Figure 2. Self-developed traffic meteorological data collection equipment.

Figure 3. Geographical locations of the 17 meteorological stations.

Figure 7. The structure of multi-head attention.

Figure 8. The loss curves under each prediction time step with a window size of 3 hours. (a) Time step 15 mins. (b) Time step 30 mins. (c) Time step 1 hour. (d) Time step 2 hours.

Figure 9. Model performance comparison on (a) the WD-17 dataset and (b) the WD-Vigo dataset.

Figure 9 clearly shows the performance advantage of the model, providing an intuitive comparison of the metrics of the various models. For the experiment on the WD-17 dataset, the actual and predicted values on the test set were also recorded and compared. Figure 10 displays the comparison results for four different time series intervals. The experimental results show that the model can predict sudden low visibility events and that the changes in the predicted values are generally consistent with the actual values, demonstrating the model's good performance. In summary, the comparative analysis on both the WD-17 and WD-Vigo datasets demonstrates the robustness and adaptability of the proposed model across different data contexts.

Figure 10. Comparison of predicted and real values for four distinct time series intervals on the WD-17 dataset.

Figure 11. Practical demonstration of model visibility prediction performance in the Highway Traffic Meteorological Intelligent Monitoring and Proactive Control System.

Table 3. Parameter settings for the ABCNet model.

Table 4. Model's MAE under different time window and forecast time step combinations.

Table 5. Model's MSE under different time window and forecast time step combinations.

Table 6. Ablation experiment of the ABCNet model on the WD-17 dataset.

Table 7. MSE, MAE and MAPE of various models on the WD-17 dataset.

Table 8. MSE, MAE and MAPE of various models on the WD-Vigo dataset.