A Novel Underwater Wireless Optical Communication Optical Receiver Decision Unit Strategy Based on a Convolutional Neural Network

Abstract: Underwater wireless optical communication (UWOC) systems face challenges due to the significant temporal dispersion caused by the combined effects of scattering, absorption, refractive index variations, optical turbulence, and bio-optical properties. This collective impairment leads to signal distortion and degrades the optical receiver's bit error rate (BER). Optimising the receiver filter and equaliser design is crucial to enhance receiver performance. However, an optimal design alone may not be sufficient to ensure that the receiver decision unit can estimate the BER quickly and accurately. This study introduces a novel BER estimation strategy based on a Convolutional Neural Network (CNN) to improve the accuracy and speed of BER estimation performed by the decision unit's computational processor compared with traditional methods. Our new CNN algorithm utilises the eye diagram (ED) image processing technique. Despite the incomplete definition of the UWOC channel impulse response (CIR), the CNN model is trained to address the nonlinearity of seawater channels under varying noise conditions and to increase the reliability of a given UWOC system. The results demonstrate that our CNN-based BER estimation strategy accurately predicts the corresponding signal-to-noise ratio (SNR) and enables reliable BER estimation.


Introduction
Underwater wireless optical communication (UWOC) systems are showing promise as low-cost, high-capacity, energy-efficient ways to transmit data at high speeds of up to multiple gigabits per second (Gbps) over distances of 10 to 20 m [1,2]. Unlike traditional acoustic communication, UWOC offers higher bandwidth and lower latency, making it suitable for applications such as underwater exploration, environmental monitoring, and military operations [3]. However, the performance of UWOC systems is significantly influenced by various impairments, including scattering, absorption, and turbulence, which collectively deteriorate the signal quality and increase the bit error rate (BER) [4]. These challenges are becoming increasingly complex, necessitating effective solution options. One of these options is to optimise the optical receiver design circuitry [5] to render an optimum performance level for the overall receiver unit.
In an optical receiver, a decision unit (DU) with an accurate BER estimation is crucial to support the performance optimisation steps of the digital receiver systems in UWOC.
However, traditional BER estimation strategies, such as Monte Carlo simulations (MCSs) and analytical methods, are computationally intensive [6] and may not adapt well to the dynamics of the underwater environment. Pilot symbols and training sequences provide more real-time estimation but at the cost of reduced data throughput [7]. Error Vector Magnitude (EVM) and noise variance estimation offer alternative approaches but are often limited by their assumptions about the channel conditions [6,7], which makes the estimation strategy highly dependent on full knowledge of the channel impulse response (CIR) temporal profile. Monte Carlo simulations are flexible and can handle complex systems but are computationally expensive. Analytical methods are efficient, but their success depends on the model accuracy level, a dependency the appropriate decision unit should avoid. Empirical methods provide real-world accuracy but are impractical for initial design; that is, they supply design-time rather than run-time implementation knowledge. Hence, the choice of BER estimation method depends on the specific numerical needs and constraints of the communication system being analysed.
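To illustrate why MCS-based estimation is computationally expensive, the sketch below (our illustration, not a reproduction of any referenced method) estimates the BER of a simple on-off-keyed signal over an additive Gaussian noise channel by direct error counting; reliably resolving a BER of 10^-6 this way requires on the order of 10^8 transmitted bits.

```python
import numpy as np

def monte_carlo_ber(snr_db, n_bits=1_000_000, rng=None):
    """Estimate the BER of OOK over an AWGN channel by direct error counting."""
    rng = np.random.default_rng(rng)
    bits = rng.integers(0, 2, n_bits)
    # Unipolar OOK: amplitude 1 for '1', 0 for '0'; SNR is defined on the '1' level.
    snr = 10 ** (snr_db / 10)
    sigma = 1 / np.sqrt(snr)
    received = bits + sigma * rng.standard_normal(n_bits)
    decided = (received > 0.5).astype(int)   # fixed mid-level decision threshold
    return float(np.mean(decided != bits))

# A statistically reliable estimate needs roughly 100 counted errors, i.e.
# about 100/BER transmitted bits -- which is why MCS becomes prohibitively
# slow at the low BERs of interest in receiver design.
```

This also makes concrete the fixed-threshold assumption that the CNN-based decision unit discussed later avoids.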
Recent advancements in machine learning (ML) technology solution options (see Figure 1), particularly CNNs in optical performance monitoring, have introduced a new technology to help design an efficient computational processor (for the DU) that deploys a CNN-oriented strategy. CNNs can learn complex patterns embedded in eye diagrams (composed of various pixels) that are generated in real time (see Figure 2). These patterns are learned from the bit stream received at the input of the DU, making them suitable for dynamically adapting to the varying conditions in UWOC systems. CNNs recognise patterns in visual data, making them suitable for processing eye diagram images, a critical visualisation tool used in digital communication systems to evaluate signal integrity and quality. Eye diagrams encapsulate key performance metrics, including timing jitter and noise levels, providing a comprehensive snapshot of the signal's health. By utilising the capabilities of CNNs, it is possible to develop a more robust and efficient decision unit strategy that improves BER estimation accuracy and overall system performance. This CNN-based DU strategy does not depend on the transmission modulation format, channel stochastic impairments, or the need to set a fixed threshold during the design phase to estimate the BER. The CNN training data pool is continuously enhanced without reducing the data throughput. Additionally, it is worth mentioning that the DU implementation should not require knowledge of the CIR during design or run time. This CNN solution approach to building a high-performance DU is the core of this study.
Using CNN technology is not new in optical communications. CNNs have been previously used in optical performance monitoring (OPM) to measure the parameters of optical systems such as chromatic dispersion (CD), modulation format identification (MFI), and signal-to-noise ratio (SNR) [8][9][10][11][12]. Considering these measures is crucial to developing an affordable OPM system with strong diagnostic capabilities. Further investigation and analysis are necessary to address the obstacles and issues that this field faces, including the natural factors in underwater environments and the accompanying phenomena, whether they are inherent optical properties (IOPs), such as absorption, scattering, and scintillation, or apparent optical properties (AOPs), such as reflectance. Section 2 summarises the relevant published studies on ANNs, specifically focusing on CNNs. The purpose is to facilitate navigation and focus on implementing our proposed new CNN approach and its architecture and design elements.
This study first applies the CNN model directly to eye diagram images to predict SNR values through regression; subsequently, the BER is extracted from the SNR for UWOC receiver systems. The CNN model can provide accurate predictions at a reasonable cost, regardless of water type, pulse shape, and noise sources. It achieves a Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) in the range of 0.29-0.52 and 0.39-0.73, respectively, rendering it a fast and accurate way to assess the received signal inside the receiver of the UWOC system. Consequently, it becomes a core component in the decision unit, as depicted in Figure 3. The training of the CNN model is based on handling the nonlinearity of water channels under various noise environments, which helps identify and manage the UWOC systems' reliability even though the impulse response of the water channel is not yet fully characterised. Although the eye diagram images in this study have been generated from simulations, the model has proved that the concept can be conveniently expanded to assess real-time generated eye diagrams.
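The SNR-to-BER extraction step and the reported error metrics can be sketched as follows. The OOK/NRZ mapping BER = 0.5 erfc(Q/√2) with Q = √SNR is a standard textbook relation used here for illustration; the exact mapping in a deployed receiver depends on the modulation format and decision statistics.

```python
import numpy as np
from math import erfc, sqrt

def ber_from_snr(snr_db):
    """Map an SNR estimate (dB) to BER for OOK/NRZ with Gaussian noise,
    using BER = 0.5 * erfc(Q / sqrt(2)) with Q = sqrt(SNR)."""
    q = sqrt(10 ** (snr_db / 10))
    return 0.5 * erfc(q / sqrt(2))

def mae(y_true, y_pred):
    """Mean Absolute Error between true and predicted SNR values."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def rmse(y_true, y_pred):
    """Root Mean Squared Error between true and predicted SNR values."""
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))
```

With these definitions, an SNR regression error of a few tenths of a dB (as reported for the CNN) translates into a correspondingly tight BER estimate.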
Our proposed tool is vital for researchers and communication engineers interested in UWOC because of the difficulty of measuring SNR in real-world scenarios. Thus, when a new pulse is received, an image of the corresponding eye diagram is generated at run time.
Then, the ML model can learn and deduce the SNR value with high accuracy. This reliable approach deals with the nonlinearity of channels in underwater environments, such as multiple scattering, turbulence, scintillation, propagation time jitter, and the multipath effect, which causes intersymbol interference (ISI), as well as receiver thermal noise. Furthermore, the decision unit will be an ML-based unit using the NN to pick the best-matched image to make a decision. Because training high-performance ML applications on large processing units takes a long time, a Microsoft Azure Virtual Machine (VM) was used in this study. This study also used various other resources, including Python 3.9.13, TensorFlow and Keras 2.12.0, cloud computing, the SQLite Database Management System (SQLite DBMS), and eye diagram images. The essential features for signal processing in a UWOC system are seawater type, channel model, pulse shape, pulse width, and the zero-position symbol. Based on these features, the proposed algorithm generates eye diagram images.
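As a concrete, simplified illustration of how an eye diagram image can be formed from a received bit stream, the sketch below slices a sampled NRZ waveform into overlapping two-bit windows; the sampling rate, bit pattern, and noise level are arbitrary assumptions, not the paper's generation parameters.

```python
import numpy as np

def eye_segments(waveform, samples_per_bit, span_bits=2):
    """Cut a sampled waveform into overlapping span_bits-wide traces.
    Overlaying (plotting) all traces on a common time axis forms the eye diagram."""
    seg_len = span_bits * samples_per_bit
    n_segs = (len(waveform) - seg_len) // samples_per_bit
    return np.array([waveform[i * samples_per_bit: i * samples_per_bit + seg_len]
                     for i in range(n_segs)])

# Example: a noisy NRZ stream with 16 samples per bit.
rng = np.random.default_rng(0)
bits = rng.integers(0, 2, 200)
wave = np.repeat(bits, 16) + 0.05 * rng.standard_normal(200 * 16)
traces = eye_segments(wave, samples_per_bit=16)
# Each row of `traces` is one eye trace; rasterising all rows onto a pixel
# grid yields the ED image that is fed to the CNN.
```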
This study employed a CNN approach to perform regression analysis on eye diagram images to determine SNR values and, subsequently, the corresponding BER. The CNN contained a flattened feature map as the input, a hidden layer with a ReLU activation function, and an output layer with a linear activation function. The CNN took eye diagram images as inputs and processed them through subsequent layers to obtain feature maps. The first layer is called the convolutional layer. This study conducted 13 trials, each utilising different CNN models with filters ranging from 16 to 64.
Additionally, the feature map from the convolutional layer underwent max pooling. Five iterations of the convolutional and max pooling layers were performed. The last feature map was flattened to obtain the input values entering the Fully Connected (FC) layer. Additionally, dropout (a regularisation technique) was used in the FC layer of the NN to tackle the overfitting problem. The dropout rate used was 0.45. The Adam optimiser and a learning rate of 10^-5 were employed in all trials. Finally, the output layer of the CNN yielded predictions in the form of SNR values. In this study, our newly designed CNN instrument is equivalent to a computational processor for the optical receiver electronic circuitry decision unit. The training and validation loss exhibit minimal disparity, and the congruity in the performance metrics suggests that the proposed model is precise and comprehensive.
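The architecture described above can be sketched in Keras (the paper's stated toolchain) as follows. The five conv/max-pool stages, the 16-64 filter range, the 0.45 dropout, the Adam optimiser with a 10^-5 learning rate, and the linear output match the text; the kernel size, input resolution, and dense-layer width are illustrative assumptions.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_snr_cnn(input_shape=(128, 128, 1), base_filters=16):
    """Regression CNN in the spirit of the text: five conv + max-pool stages,
    a flattened feature map, a ReLU dense layer with 0.45 dropout, and a
    linear output predicting the SNR value."""
    inputs = keras.Input(shape=input_shape)
    x = inputs
    for i in range(5):                                   # five conv/max-pool iterations
        x = layers.Conv2D(min(base_filters * 2 ** i, 64), 3,
                          padding="same", activation="relu")(x)
        x = layers.MaxPooling2D(2)(x)
    x = layers.Flatten()(x)                              # flattened feature map
    x = layers.Dense(128, activation="relu")(x)          # FC hidden layer (width assumed)
    x = layers.Dropout(0.45)(x)                          # regularisation against overfitting
    outputs = layers.Dense(1, activation="linear")(x)    # predicted SNR
    model = keras.Model(inputs, outputs)
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-5),
                  loss="mse", metrics=["mae"])
    return model
```

Training then reduces to `model.fit(eye_images, snr_labels, ...)` on the simulated eye diagram dataset.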
Additionally, as the neural network's size (the number of parameters) increases, the model's performance also increases. This study demonstrates that CNNs can make decisions using cost-effective functions with limited trainable parameters (ranging from 516,881 to 2,267,201). These decisions apply to various types of waters, even in the presence of ISI noise and fluctuations in the water environment. This study is organised as follows: Section 2 briefly reviews related studies. Section 3 reviews the basics of UWOC systems. Section 4 discusses the foundations of CNN modelling with basic theory. Section 5 presents the CNN algorithm design and implementation. Section 6 provides a comprehensive overview of the results of the SNR and BER predictions, the performance metrics, and the statistical summary of the obtained results. Sections 8 and 9 discuss the conclusion and future studies, respectively.

Related Studies-A Brief Review
Artificial intelligence (AI) has caused significant reorganisation in many different industrial and scientific sectors as machines learn how to solve specific problems [13]. Computer algorithms acquire knowledge of the fundamental connections within a provided dataset and autonomously detect patterns to make decisions or forecasts [14]. Machine learning (ML) algorithms enable machines to implement intellectual activities by applying complex mathematical and statistical models [15]. Specifically, supervised and unsupervised ML methods have played an essential and effective role in optical communication, especially in detecting impairments and performance monitoring in UWOC systems. ML methods are important interdisciplinary tools that utilise eye diagram images as feature sources in several fields, including computer vision [16], equalisation [17], signal detection, and modulation format identification [18]. Some examples of techniques used in this study include Support Vector Machine (SVM) [19], k-means clustering to mitigate nonlinearity effects [20,21], Principal Component Analysis (PCA) for modulation format identification (MFI), optical signal-to-noise ratio (OSNR) monitoring [22], and Kalman filtering for OPM [23]. Reference [24] indicated that CNNs could achieve the highest accuracy compared to five other ML algorithms: Decision Tree (DT), K-Nearest Neighbour (KNN), Back Propagation (BP), Artificial Neural Networks (ANNs), and SVM. Figure 1 depicts the various applications of ML algorithms used in optical communication. This study provides a comprehensive review of ML solution applications in UWOC technology.
Neural networks, such as ANNs, CNNs, and recurrent neural networks (RNNs), are highly suitable machine learning tools. They are capable of learning the complex relationships between samples or extracted features from symbols and channel parameters such as optical signal-to-noise ratio (OSNR), polarisation mode dispersion (PMD), polarisation-dependent loss (PDL), baud rate, and chromatic dispersion (CD) [10,[25][26][27][28][29][30][31][32][33]. The OSNR is a signal parameter that significantly impacts the effectiveness of optical links. The OSNR can be used to predict the bit error rate (BER), which directly gauges receiver performance [34]. Reference [35] proposed and demonstrated a system for compensating for fibre nonlinearity impairment using a simple recurrent neural network (SRNN) with low complexity. This method reduces computational complexity and training costs while maintaining good compensation performance.
Several methods based on automatic feature extraction can be used to obtain the features input into the neural network (NN). These methods use constellation diagrams, asynchronous amplitude histograms (AAHs), Asynchronous Delay Tap Plots (ADTPs), Asynchronous Single Channel Sampling (ASCS), and In-phase Quadrature Histograms (IQHs) for SNR and other parameter estimations. Here, we review the various works that display a range of OPM studies utilising machine learning techniques to forecast signal-to-noise ratio (SNR) values through various approaches. In [36], a new machine learning OPM method is proposed that uses support vector regressors (SVRs) and modified In-phase Quadrature Histogram (IQH) features to estimate several optical parameters, including signal-to-noise ratio (SNR) and chromatic dispersion (CD). A deep learning algorithm in ref.
[37] has been successfully applied in wireless communications, but it often involves challenging nonlinear problems. An ANN algorithm in [26] was developed to calculate the signal-to-noise ratio (SNR) using On-Off Keying (OOK) and Differential Phase Shift Keying (DPSK) data. The training errors for OOK and DPSK were 0.03 and 0.04, respectively. The ANN is trained by sending a series of well-known symbols before being used as an equaliser. The parameters are modified to reduce discrepancies between the desired and ANN outputs [38]. Improving the ANN at the receiver can bring several benefits, such as reducing training time and complexity, maintaining high performance, achieving high data rates and bandwidth transmission capabilities, improving efficiency, and enhancing multipath delay robustness. Various studies have highlighted these advantages, including references [8,12,31,[39][40][41][42][43]. A 10 Gbps NRZ modulation scheme measures the SNR using statistical parameters, such as means and standard deviations, obtained from the ADTP. The RMSE of the ADTP is 0.73. In [31,[44][45][46][47][48][49], a DNN was employed to classify SNR from AAHs and 16-QAM PDM-64QAM with an accuracy that can reach 100%. OSNR monitoring from 10 to 30 dB was achieved using 10 Gb/s NRZ-OOK and NRZ-DPSK from ASCS. Constellation diagrams were used in [50,51] to estimate SNR with errors less than 0.7 dB by designing CNNs with QPSK, PSK, and QAM modulation formats. The fundamental CNN algorithm for SNR estimation is presented in [52], along with methods for preprocessing received signals and selecting optimal parameters. The technique efficiently and accurately identifies the modulation format and estimates SNR and BER using 3D constellation density matrices in Stokes space.
Eye diagrams have been utilised in the literature to track OSNR, PMD, CD, nonlinearity, and crosstalk via NNs [53][54][55]. An eye pattern reflects various optical communication noises simultaneously (e.g., thermal noise, time jitter, and ISI); consequently, the SNR decreases and the signal declines as noise levels rise. SNR is used to investigate the quality of the received signal in communication systems. Reference [56] presents a long short-term memory (LSTM)-based deep learning approach to simultaneously estimate SNR and nonlinear noise power. The test error is less than 1.0 dB, and the modulation types include QPSK, 16QAM, and 64QAM. The SNR monitoring method suggested in ref. [57] uses an LSTM neural network, a classifier, and a low-bandwidth coherent receiver to convert continuous monitoring into a classification problem. It is cost-effective and suitable for multi-purpose OPM systems because it achieves excellent classification accuracy and robustness with minimal processing complexity.
The eye diagram, which is used to locate optical signal impairments by overlapping the symbols, depicts the amplitude distribution over one or more bit periods. SNR and BER indicate how well a system performs by assessing the signal quality based on various properties: eye height, eye width, jitter, crossing percentage, and levels 0 and 1 (Figure 2).
SVM for classification and NN for regression were studied in refs. [25][26][27]38] using 64-QAM, 40 Gb/s RZ-OOK, 10 Gb/s NRZ-OOK, and DPSK. The input features from the eye diagram are mean, variance, Q-factor, closure, jitter, and crossing amplitude. The ANN reports a correlation coefficient of 0.97 and 0.96 for OOK and DPSK systems, respectively [58].

Mathematics 2024, 12, 2805

NN regression was developed to extract variance from eye diagram images, and SNR with a range from 4 to 30 dB was measured with a mean estimation error range of 0.2 to 1.2 dB for 250 km [25]. Another study used an ANN to extract 24 features from eye diagram images. The RMSE values ranged from 1.5 to 2 for SNRs between 10 and 30 dB using NRZ, RZ, and QPSK for a data rate of 40 Gb/s [59]. Table A1 (in Appendix A) presents the studies from 2009 to 2024 that used ML to extract features from eye diagram images to obtain signal-to-noise ratios; the table also shows the implementations of the NN algorithms and model performance and compares these studies with ours. References [24,60,61] demonstrated CNN-based algorithms on eye diagram images and discussed the CNN structure and implementations in detail. These studies generated eye diagrams by run-time simulation or experimental setup and used classification techniques to obtain the SNR (see Table A2 in Appendix A). What is crucial to note is that while our study has created and implemented a new CNN structure for UWOC, previous efforts have primarily focused on optical fibre. Our approach to estimating SNR directly from eye diagrams, which involves 13 regression CNN models, is at the heart of the novelty of our study.

UWOC System Model
In the following sections, we will introduce our study as an innovative method for rapidly estimating the bit error rate (BER) in UWOC technology. However, before doing so, we will provide concise explanations of two topics to help the audience understand the underlying challenges this study aims to address. These topics are (1) the digital signal evaluation cycle, in which the digital signal transforms from an optical digital signal on the transmitter (Tx) side to an electronic digital signal as an output of the optical receiver (Rx), and (2) the conventional BER estimation regimes, which include some familiar approaches: modified Monte Carlo (MC)-based estimation methods, the MC prediction method, and the Log-Likelihood Ratio-based BER model. The UWOC system generally consists of three fundamental components, as depicted in Figure 3: the transmitter unit, the water propagation channel, and the receiver section.

The photons propagate across the water in the underwater communication channel independently from each other through any medium, facing different sequenced sets of optical events: transmission, absorption, and scattering (elastic and inelastic). The impact on the transmitted optical signal includes attenuation, temporal and spatial beam spreading, deflection of its geometrical path, and amplitude and phase distortions [63]. Degradation, such as absorption and scattering, significantly impacts the UWOC's performance [59]. Turbulence is another degrading factor that causes beam spreading, beam wander, beam scintillation, and link misalignment. Oceanic water types can be classified as follows [64]: clean ocean water, pure sea water, turbid harbour water, and coastal ocean water. Furthermore, in
turbid harbour water, several photons may arrive at the receiver with delays, intersymbol interference (ISI), and fading signal, reducing communication viability [65].

The Transmitter Unit
The transmitter unit utilises a beam-shaping optical unit to interface with the water propagation channel. The transmitter unit's modulator provides the needed modulation shaping characteristics to generate the information bit stream. Moreover, in UWOC systems, the driver circuit is another crucial part of the transmitter unit [66]. The main job of this device is to convert the electrical signal from the modulator into an optical signal that can be transmitted through the water channel. The driver circuit typically consists of a laser or LED driver that provides the necessary current to the light source, which emits the optical signal s(t). The selection of the light source and driver circuit is determined by the particular system requirements, such as the desired data rate, transmission distance, power consumption [67], and optical characteristics of the water channel [68]. In the UWOC system, the LED setup is more affordable and straightforward, but the connection range is very constrained because of the incoherent optical beam and light spreading in all directions [4]. Laser diodes are often used as the light source in UWOC systems due to their long ranges, high output power intensity, improved collimation characteristics, narrow beam divergence [69], high efficiency, small size [70], high data rates, and low latencies [4]. However, the high-quality output of the coherent laser beam is quickly degraded by turbulence and underwater scattering. A laser-based UWOC system may reach a link distance of 100 m in clear water and 30 to 50 m in turbid water, while an LED-based UWOC system may cover a linkspan of no more than 50 m [69].

UWOC Propagation Channel
In UWOC systems, water is the communication channel via which the optical signal s(t) propagates. One of the challenges of UWOC is that there is no definite mathematical expression for the impulse response function h_c(t). Hence, h_c(t) must be reliably modelled to assess the scope of impacts on the propagated s(t) due to water channel impairments like absorption, single/multiple scattering, scintillation, and turbulence. These degradations degrade the temporal and spatial quality of s(t), reducing the received OSNR [71] at the surface of the photodetector. Many studies (e.g., [70][71][72]) focus on solving the radiative transfer equation (RTE) analytically and numerically, accounting for different sets of inherent optical properties (IOPs) that mainly include absorption and scattering. The analytical solutions of the RTE are based on a wide range of assumptions, or rather simplifications. These solutions are considered benchmark limits for the numerical ones. The simplest and most well-known benchmark is the Beer-Lambert law (BLL) [71][72][73]. The main aim of numerical solutions is to derive an extrapolated closed mathematical form using double gamma curve-fitting to obtain a temporal profile for h_c(t), which accounts for the impairments' impact limits for different water types and given link configurations. Once the h_c(t) format is defined, we can compute the convolution of s(t) with h_c(t), the product of which is the received optical signal r_opt(t).
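As a minimal numerical sketch of this pipeline: the Beer-Lambert benchmark gives the straight-path loss e^(-cL), and once a sampled h_c(t) is available, r_opt(t) follows by discrete convolution. The attenuation coefficients below are typical values quoted in the UWOC literature for the four water types and are included purely for illustration, not as measurements from this study.

```python
import numpy as np

# Typical beam attenuation coefficients c = a + b (m^-1, green-blue window)
# often quoted in the literature; treat these as illustrative values.
C_COEFF = {"pure sea": 0.056, "clear ocean": 0.150,
           "coastal": 0.305, "turbid harbour": 2.170}

def beer_lambert_loss(water_type, L):
    """Path loss exp(-c * L) of the Beer-Lambert law benchmark."""
    return np.exp(-C_COEFF[water_type] * L)

def received_signal(s, h_c, dt):
    """r_opt(t) = s(t) * h_c(t): discrete convolution of the transmitted
    pulse samples with a sampled channel impulse response (sample step dt)."""
    return np.convolve(s, h_c) * dt
```

The rapid fall-off of `beer_lambert_loss` with distance in turbid water makes the 10-20 m linkspans quoted earlier plausible.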
In this study, we utilised the following h_c(t) format versions: (1) double gamma functions (DGFs), (2) weighted double gamma functions (WDGFs), (3) a combination of exponential and arbitrary power functions (CEAPFs), and (4) Beta Prime (BP). The impairment scope of each h_c(t) model is shown in Table 1. The CEAPF and BP formats might look different from the foundational DGF but can be reduced back to the DGF.

Model Name | The Equation of the Model | Ref.

DGF
The closed-form expression of the double gamma functions (DGFs) is given as follows:

WDGF
The weighted double gamma functions (WDGFs) model is given as follows:

CEAPF

A combination of exponential and arbitrary power functions (CEAPFs) is given as follows:

BP
The Beta Prime (BP) distribution is given as follows. The coefficients, …, and β₂ are double gamma curve-fitting parameters; v is the light velocity in the seawater medium under consideration, and L is the link span distance between the Tx and Rx.
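To make the DGF row concrete, one widely cited DGF form is h_c(t) = C₁ Δt e^(−C₂ Δt) + C₃ Δt e^(−C₄ Δt) with Δt = t − L/v. A minimal sketch follows; the coefficient values in the example are hypothetical placeholders, not the published curve-fitting parameters.

```python
import numpy as np

def dgf_impulse_response(t, c1, c2, c3, c4, link_m, v_mps=2.25e8):
    """Double gamma function CIR:
    h_c(t) = C1*dt*exp(-C2*dt) + C3*dt*exp(-C4*dt), dt = t - L/v,
    where L/v is the direct propagation delay over the link span."""
    dt = np.maximum(t - link_m / v_mps, 0.0)  # causal: h_c = 0 before t0 = L/v
    return c1 * dt * np.exp(-c2 * dt) + c3 * dt * np.exp(-c4 * dt)

t = np.linspace(0.0, 5e-7, 1000)  # 0..500 ns time axis
h = dgf_impulse_response(t, 1e9, 5e7, 2e8, 1e7, link_m=10.0)  # placeholder C_i
```

The curve rises from zero at the direct-path delay and decays with a long tail, which is the temporal dispersion that ultimately closes the receiver's eye diagram.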

The Receiver Unit
An optical detection system, or receiver, is one of the main components of UWOC. On the receiver side, the optical signal goes through an optical filter and a focusing lens, and the photodetector then captures it. Since a photodiode can only transform light intensity variations from an LD or laser into corresponding current changes [78,79], a trans-impedance amplifier is cascaded in the following stage to convert current into voltage. The transformed voltage signals then go through a low-pass filter responsible for shaping the voltage pulse to reduce the thermal and ambient noise levels without causing significant inter-symbol interference (ISI) [80,81].
The signal is then passed through a signal quality analyser for demodulation and decoding [82]. Equalisation is used to reshape the incoming pulse, extract the timing information (sampling), and decide the symbol value. A PC or BER tester finally collects and analyses the recovered original data to evaluate several important performance parameters, such as the BER. Many types of photodetectors can be used in optical receivers; for more details, see ref. [62]. Most functional OWC systems use a PIN or an avalanche photodiode (APD) as the receiver [83]. The UWOC receiver system must meet specific requirements to address the effects of noise and attenuation. The receiver's most significant parameters are a large FOV, high gain, fast response time, low cost, small size, high reliability, high sensitivity and responsivity at the operating wavelength, and high SNR [83]. The APD can provide higher sensitivity, higher gain, and faster response times. It can also be used in longer UWOC links (tens of metres) and wider bandwidths, but at a much higher cost and with more complex circuits. The noise performance of these two devices is their most significant difference. The main source of noise in PIN photodiodes is thermal noise, while in APDs, it is shot noise [79,82,[84][85][86].
However, the PIN photodiode appears to be a more favourable technology than the APD for shorter wavelengths in UWOC systems [4]. To process and understand the received data, the decision unit in an optical receiver converts the signal into discrete binary values. It compares the sampled voltage to a reference level, or threshold (D_th). Based on the received optical signal, this procedure estimates the underlying BER, from which a decision is made on whether each bit of the binary s(t) is a "0" or a "1" [87][88][89].
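The threshold comparison and the resulting error count can be sketched in a few lines; the threshold and sample values below are illustrative.

```python
import numpy as np

def hard_decision(samples, d_th):
    """Decide '1' if the sampled voltage exceeds the threshold D_th, else '0'."""
    return (np.asarray(samples) > d_th).astype(int)

def empirical_ber(decided, sent):
    """BER as the fraction of decided bits that differ from the sent bits."""
    return float(np.mean(np.asarray(decided) != np.asarray(sent)))

bits_hat = hard_decision([0.2, 0.8, 0.6], d_th=0.5)  # -> [0, 1, 1]
```

In practice D_th is itself tuned (e.g., to the midpoint of the two received levels), since a poorly placed threshold inflates the error count even for a clean eye.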

Digital Signal Evaluation Cycle from Optical to Electronic-A Mathematical Viewpoint
The digital signal evaluation cycle is based on the illustration in Figure 4. The main components of a typical optical receiver system (Rx), as explained in Section 3.2, include a photodetector, a preamplifier/amplifier, a filter/equaliser, and a decision unit (DU). In a typical optical digital communication system, the transmitted optical signal s(t) can be represented as follows [5]:

s(t) = Σ_k a_k h_p(t − kT),

where T is the signalling period; for a binary (OOK) signal format, if τ is the timespan of each bit within a symbol, then T = τ, so 1/T is the bit rate. a_k is the energy received in the k-th symbol; for a binary system, a_k ∈ {0, 1}, and h_p(t) is the transmitted optical pulse. Typically, s(t) experiences temporal and spatial distortions while propagating through a medium channel (air, fibre optic, or water), depending on the profile of the propagation channel impulse response h_c(t). The received optical signal r_opt(t) is the footprint of the convolutional impact of h_c(t) on s(t). Hence, r_opt(t) can be expressed as follows:

r_opt(t) = s(t) ⊗ h_c(t),

where ⊗ denotes the convolution operation. In this study, we consider a binary direct detection Rx, as depicted in Figure 4. Without loss of generality, we assume that the Rx includes a PIN photodetector with an internal gain (g) equal to one. The photodetector converts the input photons of r_opt(t) into the electronic signal r_sig(t), which can be expressed as follows:

r_sig(t) = Σ_j g h_d(t − t_j),

where {t_j} denotes the photoelectron emission times and h_d(t) is the detector impulse response. Therefore, the filter electronic signal output r_f(t) is

r_f(t) = r_sig(t) ⊗ h_f(t) + r_th(t).

If we assume that h_d(t) = δ(t), then this expression takes the following form:

r_f(t) = Σ_j g h_f(t − t_j) + r_th(t).

We should note that the assumption h_d(t) = δ(t) is valid for modern fast-response PIN detectors. The first term is the signal component, and the second term r_th(t) is the AGTN. Here, {t_j} is the set of photoelectrons' arrival times governed by Poisson statistics; the underlying counting process N(t) is inhomogeneous, with a time-varying rate intensity set by the symbol energies {a_k} in the expression for s(t).
It is expected that the DU in Figure 4 will be able to estimate r_f(t) as an accurate replica of s(t). The accuracy and speed of decision-making, which is compared against the function s(t), heavily rely on the computational processing strategy of the DU. This strategy minimises the BER to meet the receiver's performance goals.
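The chain above, s(t) convolved with h_c(t), AGTN added, and then mid-bit sampling against a threshold, can be sketched end to end. The pulse train, channel tail, noise level, and threshold below are all illustrative assumptions, not the paper's simulation settings; the deliberately heavy channel tail produces visible ISI, so some decision errors remain.

```python
import numpy as np

rng = np.random.default_rng(0)

sps = 16                                 # samples per bit period T
bits = rng.integers(0, 2, 200)           # a_k in {0, 1}, OOK
s = np.repeat(bits, sps).astype(float)   # rectangular pulse train s(t)

h_c = np.exp(-np.arange(4 * sps) / sps)  # assumed decaying-tail CIR
h_c /= h_c.sum()                         # normalise channel energy

r_opt = np.convolve(s, h_c)[: s.size]    # r_opt(t) = s(t) (x) h_c(t)
r_f = r_opt + 0.05 * rng.standard_normal(s.size)  # filtered signal + AGTN

samples = r_f[sps // 2 :: sps]           # one decision sample per bit
decided = (samples > 0.5).astype(int)    # fixed threshold D_th = 0.5
ber = float(np.mean(decided != bits))
```

Shortening the channel tail or adapting D_th to the smeared levels would both lower the error count, which is exactly the design space the DU strategy must navigate.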

BER Computational Strategies for Decision Unit-Background Tutorial
In this section, before presenting our novel CNN-based bit error rate (BER) estimation technique, we briefly present tutorial descriptions of the conventional BER estimation regimes, which include some familiar approaches: the Monte Carlo (MC) prediction method, the Log-Likelihood Ratio-based BER model, and modified MC-based estimation approaches.

BER Estimation Schemes-A Brief Review
Many techniques may be used to perform bit error rate estimation. This subsection first provides a synopsis of the conventional MC simulation, revealing that its execution time for low BERs is very long, and then presents three refinements: quasi-analytical estimation, importance sampling theory, and tail extrapolation probabilities.
Such solutions demand assumptions regarding the actual system behaviour, and their effectiveness depends greatly on the presumed parameters, which will likely have to be altered for different communication systems. In general, finding the ideal model or suitable parameters is not easy. A number of newer BER estimators based on the LLR distribution have been introduced; nevertheless, they have a few shortcomings, such as being dependent on the SNR uncertainty estimation and the specific channel features. Moreover, all the aforementioned approaches demand awareness of the transmitted bit stream, while in practical situations the estimator certainly does not know the transmitted data. In contrast, our new CNN imaging computational processor requires no prior information.

Monte Carlo (MC) Method Simulation
The MC simulation (MCS) approach is predominantly used for BER estimation in communication systems [90,91]. This estimation approach is implemented by passing N data symbols through a model that reflects the influencing features of the underlying digital communication system and counting the number of errors that take place at the receiver. The simulation run includes noise sources, pseudo-random data, and device models, which process the digital communication signal. In short, the MC simulation processes a number of symbols, and eventually, the BER is estimated.
Let us assume that we have a standard baseband signal model representation, as shown in Equation (2), and that the decision unit uses the Bernoulli decision function I(a_k), expressed as follows:

I(a_k) = 1 if â_k ≠ a_k, and I(a_k) = 0 otherwise, (5)

where a_k ∈ {0, 1} as defined in Section 3.4, and the ˆ sign refers to the decided (estimated) value of the variable. Accordingly, the BER can be written in terms of the probability of error p_e as follows:

p_e = E[I(a_k)] = P(â_k ≠ a_k), (6)

where E[·] is the expectation operator and P(â_k ≠ a_k) is the probability that the decided value â_k does not equal the transmitted a_k. If we take into consideration the entire stream of symbols in Equation (2b), then the BER is estimated by utilising the ensemble average of p_e:

p̂_e = (1/K) Σ_{k=1}^{K} I(a_k), (7)

where K is the maximum number of symbols (the bit stream size) in Equation (2b). Equation (7) helps to determine the estimation error ε = p̂_e − p_e, and its variance is given as follows:

σ_ε² = p_e(1 − p_e)/K. (8a)

Hence, we can write the normalised estimation error as follows:

σ_n = σ_ε/p_e = sqrt((1 − p_e)/(K p_e)). (8b)

For a small BER, Equation (8b) can be simplified to

σ_n ≈ 1/sqrt(K p_e). (8c)

Here, σ_n is an indicator of the target accuracy we must aim at. Consequently, we can determine the required K for a target performance as given below:

K ≥ 1/(σ_n² p_e). (8d)

Equation (8d) indicates that a small BER value requires a large simulated signal bit stream. For example, to assess a system with a BER of 10⁻⁶ at a normalised error of σ_n = 0.1, we require no fewer than 10⁸ bits in the signal stream. This numerical requirement ensures that the MC simulation trial size satisfies the central limit theorem. This operational limitation means the decision unit will take a long time to estimate a trusted value of the BER. Accordingly, MC simulation is impractical for bit rates larger than 100 Mbit/s. It is worth mentioning that in our discussion, we assumed that the bit errors were independent.
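The required-K relation above can be checked numerically; σ_n below is the target normalised estimation error.

```python
import math

def required_symbols(target_ber, sigma_n):
    """Monte Carlo stream size K >= 1 / (sigma_n^2 * BER) needed to keep the
    normalised estimation error at or below sigma_n."""
    return math.ceil(1.0 / (sigma_n**2 * target_ber))

# A BER of 1e-6 at 10% normalised error needs on the order of 1e8 bits:
k = required_symbols(1e-6, 0.1)
```

At 100 Mbit/s this already corresponds to about one second of captured traffic per estimate, before any per-bit processing cost, which is the practical bottleneck the text describes.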

Importance Sampling Scheme
As concluded earlier, a small BER demands a large K. From a DU point of view, this is considered a fatal limitation of MC implementation, specifically for spread spectrum (SS) systems [92] (such as CDMA systems), in which every transmitted bit must be modulated via the SS code with an abundance of bits.
A modified MC method called the importance sampling (IS) method can be utilised to decrease BER simulation complexity for SS systems [93]. Further, ref. [94] introduces a method for estimating the BER based on IS applied to trapping sets. In the IS approach, the noise source statistics in the system are biassed so that bit errors occur with a greater p_e, thus minimising the needed execution time. For instance, for a target BER equal to 10⁻⁵, we artificially degrade the performance of the channel, pushing the BER to 10⁻².
To explain the IS approach, let g(·) be the original noise probability density function (PDF) and let g*(·) be the biassed noise PDF obtained by utilising an external noise source. Hence, the weighting coefficient can be expressed as follows:

w(x) = g(x)/g*(x). (9a)

For a simple threshold-dependent decision element, an error takes place as soon as there is a significant excursion beyond the threshold D_th:

x > D_th. (9b)

Then, p_e is given as follows:

p_e = E[I(x)] = ∫ I(x) g(x) dx, (9c)

where I(x) is an indicator function that equals 1 when an error takes place and 0 otherwise. Hence, we can express the natural estimator of the expectation (i.e., the sample mean) as follows:

p̂_e = (1/K) Σ_{k=1}^{K} I(x_k), x_k ~ g(·). (9d)

Hence, concerning the PDF of the noise (i.e., r_th(t) in Equation (4)) and using Equations (9c) and (9d), we obtain

p_e = ∫ I(x) w(x) g*(x) dx. (9e)

Equation (9e) is not just a mathematical expression; it represents the influencing noise process statistics, and the prediction is achieved with regard to g*(·). As in the preceding subsection, we may attain the estimator using the sample mean:

p̂_e = (1/K) Σ_{k=1}^{K} I(x_k) w(x_k), x_k ~ g*(·). (9f)

Regarding Equation (9d), in Equation (9f) the weight parameter w(x) needs to be evaluated at each sample x_k. The aim is to reduce σ_ε, which can be accomplished by establishing an external noise source of biassed density.
The performance of IS-based BER estimation relies crucially on the biassing scheme w(x). An accurate estimate of the BER can be attained with a brief simulation run time if a good biassing scheme is configured for a specified receiver circuitry system. Conversely, the BER estimate might even converge at a slower rate than the conventional MC simulation. This implies that the IS technique must not be regarded as a generic approach for estimating every receiving system's BER.
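A toy instance of the biassing idea for a Gaussian decision metric: draw samples from a shifted density g*(x) so that threshold excursions are frequent, and de-bias each error indicator with the weight w(x) = g(x)/g*(x). The unit-variance Gaussian model and the bias point are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def is_tail_probability(d_th, n, bias_mean):
    """IS estimate of p_e = P(X > D_th) for X ~ N(0, 1), sampling from the
    biassed density g*(x) = N(bias_mean, 1) and weighting by w(x) = g/g*."""
    x = rng.normal(bias_mean, 1.0, n)
    w = np.exp(-0.5 * x**2 + 0.5 * (x - bias_mean) ** 2)  # g(x) / g*(x)
    return float(np.mean((x > d_th) * w))

# Q(4) is about 3.2e-5: a plain MC run of 20,000 samples would rarely see even
# one error, while the biassed run estimates it with useful accuracy.
p_hat = is_tail_probability(4.0, 20_000, bias_mean=4.0)
```

Centring g* on the threshold is a reasonable choice here; a poor bias point would illustrate the slow-convergence caveat mentioned above.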

Tail Extrapolation Scheme
We should keep in mind that the BER estimation problem is, in essence, a numerical integration problem if we regard the eye diagram (ED) in Figure 5, measured for an experimental system with SNR = 20 dB. It is possible to determine the worst case of the received bit sequence.

When we regard the PDF of the eye section at lines A and B, the lower bound on the PDF (green line) corresponds to the worst-case bit sequence, and the small red area contains all of the bit errors. The BER of the given system can be thought of as the area under the tail of the probability density function.
Generally, we cannot tell in advance which distribution the slopes of the bathtub curve in the ED follow. However, we may presume that the PDF belongs to a specific class and then perform curve-fitting on the obtained data. This technique for estimating the BER is known as the tail extrapolation (TE) method [95].
When we set multiple relaxed thresholds below the operating one, the number of times the decision metric surpasses every D_th is recorded, and a standard MC simulation can be executed. A wide category of PDFs is then fitted. The tail region is typically described by members of the Generalised Exponential Class (GEC), identified as follows:

p(x) = ν / (2σ Γ(1/ν)) · exp(−(|x − µ|/σ)^ν),

where Γ(·) is the gamma function, µ is the mean of the distribution, and σ is related to the variance V_ν through

V_ν = σ² Γ(3/ν) / Γ(1/ν).

The parameters (ν, σ, µ) are then adjusted to find the PDF that best fits the data sample; the BER can therefore be estimated by evaluating the integral of the fitted PDF beyond D_th. Nevertheless, which class of PDF and which D_th should be selected is frequently unclear, and it is generally hard to evaluate the accuracy of the estimated BER [95].
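The TE procedure can be sketched numerically for a Gaussian-class tail (GEC with ν = 2), where log p is close to linear in D_th²: measure exceedance rates at several relaxed thresholds, fit, and extrapolate to a deeper operating threshold. The sample size and threshold values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(0.0, 1.0, 200_000)            # decision-metric samples

thresholds = np.array([1.5, 2.0, 2.5, 3.0])  # shallow, easy-to-measure levels
p_meas = np.array([(x > d).mean() for d in thresholds])

# Gaussian-class tail assumption: fit log p as a linear function of D_th^2.
coef = np.polyfit(thresholds**2, np.log(p_meas), 1)

d_op = 4.5                                   # too deep to measure directly
p_extrap = float(np.exp(np.polyval(coef, d_op**2)))
```

The quality of the extrapolation hinges entirely on the assumed tail class, which is exactly the method's weakness noted above.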

The Method of Quasi-analytical Estimation
The abovementioned methods analyse the received signal components (data and noise) at the receiver's output. At this point, we consider solving the BER estimation problem utilising the following two stages: 1. One handles the signal component of r_f(t) in Equation (4); 2. The other handles the noise component r_th(t).
First, we presume that the noise is represented by an Equivalent Noise Source (ENS) and, second, that the ENS probability density function is known and determinable.
Therefore, we can assume that an ENS with an appropriate distribution can closely evaluate the receiver's performance. This approach is known as quasi-analytical (QA) estimation [96]. We can calculate the BER from the ENS statistics using the noiseless waveform. More precisely, we allow the simulation to compute the influence of signal changes in the absence of r_th(t) and then superimpose r_th(t) on the noiseless signal component.
The noise statistics assumption results in a significant drop in computation run time. Nevertheless, it may create a risk of complete miscalculation. The appropriateness of the QA estimation relies on how well the assumption matches reality [97]. Hence, predicting the ENS statistics in advance may be challenging for anything but a linear system.
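A minimal QA sketch with a Gaussian ENS: run the system noiselessly, then fold the known noise statistics in analytically at each decision sample. The levels, threshold, and σ are illustrative.

```python
from math import erfc, sqrt

def qa_ber(noiseless_samples, d_th, sigma):
    """Quasi-analytical BER: for each noiseless decision sample y, the error
    probability is the Gaussian tail of the ENS beyond the threshold D_th,
    i.e. Q(|y - D_th| / sigma) with Q(z) = 0.5*erfc(z / sqrt(2))."""
    q = [0.5 * erfc(abs(y - d_th) / (sigma * sqrt(2.0)))
         for y in noiseless_samples]
    return sum(q) / len(q)

# ISI-free OOK levels {0, 1}, threshold 0.5, ENS sigma 0.1: every sample sits
# 5 sigma from the threshold, so the BER equals Q(5), far below what a short
# MC run could resolve.
ber = qa_ber([0.0, 1.0, 0.0, 1.0], d_th=0.5, sigma=0.1)
```

With ISI, the noiseless samples spread around the nominal levels, and the same averaging over Q(·) captures the pattern-dependent degradation at no extra simulation cost.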

Estimating BER Based on the Log-Likelihood Ratio
A receiver can implement soft-output decoding to reduce the signal stream's BER (e.g., an a posteriori probability (APP) decoder). The APP decoder may output probabilities or Log-Likelihood Ratio (LLR) values. Let (a_k), 1 ≤ k ≤ K, with a_k ∈ {+1, −1}, be the bit stream, and let x_k, k = 1, 2, ..., K, represent the received values. Hence, the LLR can be defined as follows:

LLR_k = ln[P(a_k = +1 | x_k) / P(a_k = −1 | x_k)]. (11a)

Hence, when using Bayes' theorem, we obtain the following:

LLR_k = ln[P(a_k = +1) / P(a_k = −1)] + ln[p(x_k | a_k = +1) / p(x_k | a_k = −1)]. (11b)

In Equation (11b), the first term on the RHS represents a priori information, and the second represents channel information. The hard decision is implemented by computing the LLR sign as follows:

â_k = sign(LLR_k). (11c)

In [98], some basic properties of LLR values are extracted, and new BER estimators are proposed based on the statistical moments of the LLR distribution. If we examine the criterion of equiprobable symbols, so that the a priori term in Equation (11b) vanishes, then solving Equation (11b) under this criterion permits us to derive the a posteriori probabilities P(a_k = +1|x_k) and P(a_k = −1|x_k); then, we can write the following:

P(a_k = +1|x_k) = e^(LLR_k)/(1 + e^(LLR_k)) and P(a_k = −1|x_k) = 1/(1 + e^(LLR_k)). (11d)

If LLR_k = A, then we can infer the probability that the hard decision of the k-th bit is wrong:

p_k = 1/(1 + e^|A|).

Now, the BER estimate can be expressed as follows:

BER₁ = (1/K) Σ_{k=1}^{K} 1/(1 + e^|LLR_k|), (12a)

or, in terms of the PDF g_A(y) of the LLR values,

BER₂ = ∫ g_A(y)/(1 + e^|y|) dy. (12b)

The constraints of the LLR method are as follows: 1. The first estimate of the BER, given by Equation (12a), may not be as efficient as the second BER estimate, given by Equation (12b), since g_A(y) is usually Gaussian and smooth. 2. The second estimator is more complicated to execute because an estimate of g_A(y) has to be computed (for instance, utilising a histogram) prior to the integration. 3. Both methods are sensitive to the channel noise variance, as the LLR distribution strongly relies upon the accuracy of the SNR estimate. We should note that these estimators implicitly presume that the SNR is well known to the decoder.
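A numerical sketch of the first estimator for BPSK over AWGN, where the channel LLR is 2x_k/σ²: the estimator needs no knowledge of the transmitted bits, and it is compared here against a genie-aided error count. The noise level and stream length are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

# BPSK a_k in {+1, -1} over AWGN: x_k = a_k + n_k, n_k ~ N(0, sigma^2)
sigma = 0.6
a = rng.choice([-1.0, 1.0], 100_000)
x = a + rng.normal(0.0, sigma, a.size)

# Channel LLR for Gaussian noise and equiprobable symbols: LLR_k = 2 x_k / sigma^2
llr = 2.0 * x / sigma**2

# Estimator (12a): the probability that each hard decision sign(LLR_k) is wrong.
ber_llr = float(np.mean(1.0 / (1.0 + np.exp(np.abs(llr)))))

# Reference: hard-decision error count (requires knowing a_k; the estimator
# above does not).
ber_count = float(np.mean(np.sign(llr) != a))
```

Note that the LLR scaling 2/σ² bakes the SNR estimate into the statistic, which is precisely the sensitivity listed as constraint 3.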

Receiver Performance Indicators
This study highlights the receiver's performance by modelling a decision unit strategy. The UWOC receiver unit performance is influenced by the water channel signal impairments and by various noise sources on the receiver side, such as electronic thermal, optical background, dark current, and shot noise; reference [4] reviewed such noise sources. Generally, the leading performance indicators for a digital receiver are the SNR and BER. The SNR is represented by the following:

SNR = P_S / P_N,

where P_S and P_N represent the signal power and noise power, respectively.
The BER is defined as the probability of incorrect identification of a bit by the decision circuit of the underlying receiver [81]. It is one of the most important metrics for assessing signal quality and estimating communication system performance. If the number of error bits received is N_e and the total number of bits is N_t, then the BER is as follows:

BER = N_e / N_t.

The relation between the SNR and BER is embedded in the following formula:
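The SNR in dB and one commonly assumed OOK mapping from SNR to BER can be sketched as follows; the exact SNR-to-BER relation depends on the modulation and detection model, so the Q-function form below is an assumption, not the paper's formula.

```python
from math import erfc, sqrt, log10

def snr_db(p_signal, p_noise):
    """SNR = P_S / P_N, expressed in decibels."""
    return 10.0 * log10(p_signal / p_noise)

def ber_from_snr_ook(snr_linear):
    """Assumed OOK mapping: BER = Q(sqrt(SNR)/2), with
    Q(z) = 0.5 * erfc(z / sqrt(2))."""
    return 0.5 * erfc(sqrt(snr_linear) / 2.0 / sqrt(2.0))
```

Under any such monotone mapping, an accurate SNR prediction translates directly into a reliable BER estimate, which is why the CNN in this study is trained to predict SNR from the eye diagram.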

Test Data
In this study, eye diagrams and their SNRs are used as the source of the testing data fed to the CNN. A Python code was written and run on a Microsoft Azure VM to generate 576 received pulses, including some random noise; the eye diagram pattern images were then drawn, and their related SNRs were calculated from the received pulses. For eye pattern generation, we used four channel models, including the following: We also utilised two transmitted pulse shapes, Gaussian and rectangular, to implement binary OOK modulation in our simulation. The range of pulse widths (FWHM) was 0.1 to 0.95. It is worth mentioning that pulse widths beyond 0.6 are unrealistic, but we added these scenarios as a "burn test" for our solution. The ranges of the FOV values across channel models were not similar because we had to use the published double gamma fitting parameters (shown in the last row of Table 1) and their corresponding FOV value ranges.
After that, the names of the images and the SNR values were stored in an SQL database, while the images were stored in one folder. Consequently, the data were ready for the CNNs to be applied to them. These test data have the following properties: the background of the eye diagram images is black, while the diagram itself is white (greyscale), to speed up and simplify the CNN calculations. All the eye diagram images' sizes (height × width) are 2366 × 3125; this size was taken from the shapes of the images' arrays (it is already an output from the code). The SNR has a normal distribution (which means there is no bias in our data before applying ML), as shown in Figure 6. The minimum SNR value is 0.5723, the maximum SNR value is 8.1478, the mean is 3.0004, and the standard deviation is 1.5061.
The data preprocessing steps before applying the CNN were as follows:
1. Loading the images' names and SNRs from the database.
2. Converting the data into a 'pandas' data frame.
3. Shuffling the data frame.
4. Using the TensorFlow library on Python, we conducted the following:
• Loaded the eye diagram images based on their names and normalised them using max normalisation (dividing each pixel by 255).
• Converted the colour mode from RGB into grayscale.
• Used the images' original size instead of resizing them, to keep the resolution high.
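The steps above can be sketched as follows; the file names and the stand-in image array are hypothetical, and the grayscale conversion by channel averaging is one simple convention (TensorFlow's image utilities use a luma-weighted version).

```python
import numpy as np
import pandas as pd

def max_normalise(img_uint8):
    """Max normalisation: divide each 8-bit pixel by 255 -> floats in [0, 1]."""
    return img_uint8.astype(np.float32) / 255.0

def to_grayscale(rgb):
    """RGB -> single-channel grayscale by averaging the colour channels."""
    return rgb.mean(axis=-1, keepdims=True)

# Stand-ins for rows loaded from the SQLite database (names are hypothetical):
df = pd.DataFrame({"image_name": ["eye_000.png", "eye_001.png"],
                   "snr": [3.1, 2.4]})
df = df.sample(frac=1.0, random_state=42).reset_index(drop=True)  # shuffle

img = np.zeros((2366, 3125, 3), dtype=np.uint8)  # full-resolution eye diagram
x = to_grayscale(max_normalise(img))             # shape (2366, 3125, 1)
```

Keeping the full 2366 × 3125 resolution preserves the fine eye-closure detail at the cost of memory, which motivates the greyscale, black-background simplification.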

Machine Learning-Neural Networks (NNs)
Neural networks (NNs) are computing algorithms comprising processing units known as neurons that are organised into layers. These layers are connected via weights; each cell has a different weighted function. Many researchers have investigated neural networks since the 1960s [99][100][101]. NNs were developed based on how biological nerves transmit information and analyse data, and they are mainly used to increase computing performance [102]. NNs can be used for supervised learning in both classification and regression, and they can also be used in unsupervised learning.
A general NN structure consists of at least an input layer and an output layer, which allows NNs to make predictions on new inputs, plus middle-level layers known as hidden layers, which process the outputs of previous layers [26]. The neurons have various coefficients, such as the bias (θ_0) and weights (θ_i), which are modified during the training process to obtain the optimum values that make the loss as low as possible. The correlations between input-output datasets that constitute the attributes of the device or system under study are discovered using NNs. The model outputs are compared to the true desired outputs, and the error is calculated [103]. For the training phase, a sample is represented as (x, y), where the input and output are x and y, respectively. Each node performs calculations on the (x) values entered into the neural network, obtaining the value of z^(L); the expected values a^(L) are then found by applying the activation function f(x) to z^(L). The process is repeated as represented by the following equations [102]:

z^(L) = θ^(L−1) a^(L−1) + θ_0^(L−1) and a^(L) = f(z^(L)),

where z^(L) is the predicted output of each layer, which is the input for the next layer.
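One forward-propagation step per the equations above, with a sigmoid activation as an illustrative choice of f; the layer sizes and random weights are placeholders.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_layer(a_prev, theta, theta0, f=sigmoid):
    """One layer: z^(L) = theta @ a^(L-1) + theta_0, then a^(L) = f(z^(L))."""
    z = theta @ a_prev + theta0
    return f(z), z

rng = np.random.default_rng(0)
x = np.array([0.5, -1.0, 2.0])                                   # 3 inputs
a1, z1 = forward_layer(x, rng.normal(size=(4, 3)), np.zeros(4))  # hidden layer
a2, z2 = forward_layer(a1, rng.normal(size=(1, 4)), np.zeros(1)) # output layer
```

Each layer's activation a^(L) becomes the next layer's input, so a full forward pass is just this function chained once per layer.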
The form of the hypothesis function or activation function (the final output in the last layer) is represented as follows:

h_θ(x) = a_i^(l) = f(z_i^(l)),

where i is the cell number, L is the layer number, l is the last layer number, and f is the activation function. Note that this hypothesis is similar to linear regression, in which a predictor variable x and a dependent variable y are included in the model and are linearly related to one another. In this case, the output variable y is predicted based on the input variable x. The linear regression model is represented by the equation shown below:

y = θ_i x + θ_0 (i.e., y = mx + c),

which is the foundation equation for NNs, where θ_i (or m) is the slope, i.e., the rate of change of the predicted y along the best-fit line, and θ_0 (or c) is the y-intercept (Figure 7). The most significant and most often utilised component of neural networks is data propagation in both the forward and the reversed (or back) directions. This propagation is crucial for performing quick and efficient weight adjustments. The term "forward propagation", which describes moving information from the input to the output direction, has been the subject of the whole discussion up to this point. However, the neural network did not achieve practical significance until 1986, when the Back Propagation mechanism (BP) was employed [104,105]. Back Propagation is a technique used to train neural networks to adjust the weights and increase the model's generalisation to make it more reliable. The error rate of forward propagation is fed back through the NN layers. It analyses, compares, and evaluates the outcomes before going back
oppositely from the outputs to the inputs and adjusting the weights' values. This process is repeated until the weights are optimal. A reverse calculation of the weight values is carried out by finding the difference between the predicted and real values, followed by partial derivation, and Back Propagation is used to adjust the assumed weight values. After the output of each layer a^(L) is calculated, the result is passed through a function; the goal is to minimise the cost function J, and the result is then passed through the loss function, as described in Section 6. After reaching the expected value a^(l) (whether regression or classification), we find the delta error rate δ^(l) by subtracting the actual values y^(t) from the predicted ones as follows:

δ^(l) = a^(l) − y^(t),

where y^(t) is not a number but a matrix with one column (a vector), because there are several cells in each layer. The general equation for Back Propagation is as follows:

δ^(L) = (θ^(L))^T δ^(L+1) · f′(z^(L)),

where f′ is the first derivative of the activation function, T denotes the transpose of θ^(L), and · is an element-wise (dot) product, not a matrix product. A convolutional neural network (CNN) is a kind of deep feed-forward neural network and one of the most effective learning algorithms used in many applications, with significantly higher accuracy [106]. A CNN is the best algorithm for analysing image data [106] and for solving problems in several visual recognition tasks, such as identifying traffic signs, biological image segmentation, image classification [107], speech recognition, natural language processing, and video processing [108]. The power of a CNN lies in its ability to extract features from samples with different requests at a fast speed [88] and to handle high-dimensional inputs. A CNN offers two significant benefits over other ML algorithms [107]: (a) automated feature extraction from images without the requirement for feature engineering or data restoration, and (b) the algorithm complexity is significantly reduced by a network topology with local connections and weight sharing. The attention mechanism allows it to extract the most important information from an image and store its contextual relationship to other image elements [106]. The main layers of a CNN are the convolutional layer, the pooling layer, and the neural network layer. First, the convolutional layer applies several filters to input images to extract features (producing feature maps) and decrease their size [93]. The convolutional layer's final output is obtained by merging these feature maps [109]. Decreasing the number of network parameters and computations requires that the feature map size be reduced again in the pooling layer by selecting the essential features and taking the maximum values. The advantages of max pooling are that it decreases training time and controls overfitting [109]. After repeating these layers several times, the output enters a neural network as flattened input values [110]. These values go through fully connected (FC) layers that reach the final output of the CNN. The activation function can be used in a CNN on the convolutional, hidden, and output layers.
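The convolutional and max-pooling layers can be sketched in plain NumPy; as in most CNN libraries, the "convolution" below is actually cross-correlation, and the toy image and filter are illustrative.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """'Valid' 2-D convolution (cross-correlation, as in CNN libraries):
    slide the kernel over the image and take dot products."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling: keep the largest value in each block."""
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    return fmap[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

img = np.arange(36, dtype=float).reshape(6, 6)
edge = np.array([[1.0, -1.0]])   # a toy horizontal-gradient filter
fmap = conv2d_valid(img, edge)   # feature map, shape (6, 5)
pooled = max_pool(fmap)          # reduced feature map, shape (3, 2)
```

Because the same small kernel is reused at every position, the layer has only two weights here, which is the weight-sharing benefit (b) mentioned above.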

Model Solution Architecture and Design
A Microsoft Azure VM was used to develop Python code that draws eye diagram images and calculates SNR values via a multiprocessing technique. These images were saved in a folder on the VM, whereas their names and SNRs were stored as reference data in an SQL database. This study used the SQLite DBMS to store information on 576 rows of eye patterns and their related SNRs. The meta dataset used to generate the eye diagrams consists of the water type, channel model, pulse shape, pulse width, and the signal state (0 or 1), which is the value at position zero on the eye diagrams. Another code module was developed using the OOP paradigm to retrieve data from the database and train 13 CNN models on a training set to make decisions for testing images using the validation set. Then, the error between the actual and predicted SNRs was calculated. The errors include the MAE and the RMSE for the training and testing data. The BER values were extracted based on the original and predicted SNRs, and the performance of the CNN models was measured. A schematic representation of the methodology for this study is shown in Figure 8. Consequently, the ML model works as a decision unit in the optical receiver, which is the primary goal of this study.
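The reference-data step described above can be sketched with Python's built-in sqlite3 module. The table name (`eye_patterns`), the columns, and the sample rows below are our assumptions for illustration, not the study's actual schema:

```python
import sqlite3

# Hypothetical schema: image names and their SNRs stored as reference data.
conn = sqlite3.connect(":memory:")  # the study used a file-backed SQLite DB
conn.execute(
    "CREATE TABLE eye_patterns ("
    "  image_name TEXT PRIMARY KEY,"
    "  snr_db     REAL NOT NULL)"
)

# Each generated eye diagram is saved to disk and its name/SNR recorded here.
rows = [("harbour_dgf_gauss_0.10.png", 12.4),
        ("coastal_bp_rect_0.95.png", 18.7)]
conn.executemany("INSERT INTO eye_patterns VALUES (?, ?)", rows)
conn.commit()

# The training code later retrieves the reference SNR for a given image.
snr = conn.execute(
    "SELECT snr_db FROM eye_patterns WHERE image_name = ?",
    ("harbour_dgf_gauss_0.10.png",)).fetchone()[0]
print(snr)  # 12.4
```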

Model Dataset
Eye diagram images were generated using a multiprocessing technique on an Azure VM with the following components: Windows 11 Pro operating system, an x64-based Intel Xeon Platinum 8171M CPU @ 2.60 GHz (2.10 GHz), 32 GB RAM, 127 GB Premium SSD LRS storage, and eight virtual CPUs. The following attributes are required to generate the eye diagrams and conclude the SNR: the water type, the channel model, the optical pulse shape, and the pulse width. Table 2 shows the details of these attributes. The corresponding SNR values were stored in an SQL database. Some examples of the created eye patterns are shown in Figure 9.
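The multiprocessing generation step can be sketched as follows, assuming a placeholder worker function (`render_eye_diagram`) and a reduced attribute grid; the study's real code draws the eye diagram image and computes the SNR, which is not reproduced here:

```python
from itertools import product
from multiprocessing import Pool

# Hypothetical worker: in the study, each task draws an eye diagram image and
# computes its SNR; this placeholder just returns a file name and a dummy SNR.
def render_eye_diagram(task):
    water, channel, shape, width = task
    name = f"{water}_{channel}_{shape}_{width:.2f}.png"
    snr_db = 10.0 + 5.0 * width        # placeholder, not the study's SNR model
    return name, snr_db

tasks = list(product(["harbour", "coastal"],          # water type
                     ["DGF", "WDGF", "CEAPF", "BP"],  # channel model
                     ["gaussian", "rect"],            # pulse shape
                     [0.1, 0.5, 0.95]))               # pulse width

if __name__ == "__main__":
    with Pool(processes=4) as pool:    # the study's VM exposed 8 virtual CPUs
        results = pool.map(render_eye_diagram, tasks)
    print(len(results))  # 2 * 4 * 2 * 3 = 48 (name, SNR) pairs
```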

CNN Algorithm
CNNs are widely used in optical communications and networking. Regarding UWOC, ref. [111] proposed a constellation diagram recognition and evaluation method using deep learning (DL). ML is applied in networking systems to address tasks in the physical layers. These tasks include monitoring systems, assessing signal degradation effects, optimising launch power, controlling gain in optical amplifiers, and adapting modulation formats. It is also used in nonlinearity mitigation [15]. The optical receiver can serve as an OPM in addition to its primary function of receiving data. A signal waveform is graphically represented in an eye diagram to locate optical signal impairments. The amplitude distribution over one or more bit periods is depicted by overlapping the symbols. Eye diagrams are employed to evaluate the strength of high-speed digital signals [53]. A data waveform is typically applied to the sampling oscilloscope's input to create them. Then, all conceivable one-zero combinations are overlapped on the instrument's display to cover three intervals [54]. Pulses spread out beyond the period of a single symbol because of the ISI, which results from temporal variations between light beams arriving at the receiver from multiple pathways. At data rates greater than 10 Mbps, ISI seriously impairs the system's performance. A clustering algorithm can be used to identify anomaly attacks without being aware of the attacks beforehand. In ref. [112], a groundbreaking application of ML in optical network security was reported. The findings showed that ANNs have significant potential for detecting out-of-band jamming signals of various intensities with an average accuracy of 93%. In this study, TensorFlow and Keras 2.12.0 were used.
The CNN algorithm is applied to eye diagram images to predict the SNR values in different UWOC cases. To organise the inputs in a particular way, or to convert the relationship into a function that can predict an output, a CNN learns associations between the properties of the input data it receives. In this study, the eye diagrams represent the signals, and the result is the SNR prediction, whose magnitude reflects the type of impairment. The total number of samples in this study is 576 eye diagram images, split into 404 for training and 172 for testing.
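The 404/172 split can be reproduced schematically as follows; the file names and the random seed are illustrative assumptions:

```python
import random

# Reproducing the study's split sizes (576 total -> 404 train / 172 test).
image_names = [f"eye_{i:03d}.png" for i in range(576)]  # placeholder names

random.seed(42)                      # the seed is our choice, not the paper's
random.shuffle(image_names)
train, test = image_names[:404], image_names[404:]

print(len(train), len(test))  # 404 172
```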
The structure and implementation of the CNN in this study are as follows:
1. The dimensions of the input eye diagram images are 2366 × 3125 pixels, with a resolution of 600 dpi.
2. The network includes convolutional layers with a filter size of 10 and a stride of 1. The number of filters ranges from 16 to 64, increasing by four at each step. There is no activation function applied.
3. There are three non-overlapping max-pooling layers with a size and stride of 3.
4. Flattened values refer to the input values that will be fed into the NN.
5. This study refers to the hidden layer as FC and uses the ReLU activation function to reduce the CNN calculations by setting negative values to zero.
6. The ultimate output of the CNN is the prediction of the signal-to-noise ratio (SNR) using a linear activation function, which is appropriate for regression tasks.
The Functional API model was used with the Adam optimiser and a learning rate of 1 × 10⁻⁵. Figure 10 shows the model structure; each circle represents a convolutional and max-pooling layer. The architecture contains five convolutional and max-pooling layers. The output of each of these layers becomes the input of the next layer; notice that the connection paths between the flattened, hidden, and output layers are the weights (small random numbers at the beginning), and the weights affect each layer's output, as seen in Equations (15) and (16). Figure 11 shows the CNN structure and its implementation.
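Under one possible reading of points 1-6 above, a single model might be sketched with the Keras Functional API as follows; the dense-layer width and any detail not stated in the text (the number of conv/pool blocks is taken as five, per the surrounding description) are assumptions, not the paper's exact code:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_model(n_filters=16, img_h=2366, img_w=3125):
    """Sketch of one of the 13 CNN models (n_filters = 16, 20, ..., 64)."""
    inp = layers.Input(shape=(img_h, img_w, 1))
    x = inp
    # Convolutional blocks: filter size 10, stride 1, no activation,
    # each followed by non-overlapping max pooling (size = stride = 3).
    for _ in range(5):
        x = layers.Conv2D(n_filters, kernel_size=10, strides=1)(x)
        x = layers.MaxPooling2D(pool_size=3, strides=3)(x)
    x = layers.Flatten()(x)
    x = layers.Dense(64, activation="relu")(x)     # FC hidden layer (width assumed)
    out = layers.Dense(1, activation="linear")(x)  # SNR regression output

    model = Model(inp, out)
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
                  loss="mean_absolute_error",
                  metrics=[tf.keras.metrics.RootMeanSquaredError()])
    return model

model = build_model(16)
print(model.output_shape)  # (None, 1): one predicted SNR per image
```

Calling `build_model` with 16, 20, ..., 64 filters yields the family of 13 models compared in the results.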

SNR Prediction
This study constructed the CNN layers using the Keras library with the Functional API model. The loss error for a training sample, i.e., the difference between the predicted and actual values, was also calculated. The cost function, or cost error function, is the cumulative total of all errors over the training set. The cost function, which measures the model's accuracy, essentially refers to how far the predicted values are from the real data. The cost function's minimum value is sought throughout the CNN model learning phase. The task is to identify the model weights for which the cost function attains its minimum value. Gradient descent (GD) optimisation, a fundamental approach for CNN model optimisation, is employed to achieve this [113-115]. The update equation of GD is as follows:

θ_j := θ_j − α ∂J(θ)/∂θ_j

By substituting the partial derivative of J(θ_0, …, θ_n), we obtain the following:

θ_j := θ_j − (α/m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) x_j^(i)

where j = 0, 1, 2, …, n and m is the number of samples. This study used the MAE as the loss function, while the RMSE was used as a metric function to measure the model's performance. The MAE is the mean of the absolute differences between predictions and real results, in which all individual deviations carry equal weight, and the RMSE is the square root of the average of the squared differences between predictions and actual outputs. Their mathematical formulas are as follows:

MAE(y_true, y_pred) = (1/n_images) Σ_{i=1}^{n_images} |y_true(i) − y_pred(i)|   (24)

RMSE(y_true, y_pred) = √[(1/n_images) Σ_{i=1}^{n_images} (y_true(i) − y_pred(i))²]

where n_images is the variable that represents the number of eye diagram images in the testing sample.
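The MAE and RMSE formulas above transcribe directly into NumPy; the true/predicted SNR values below are hypothetical, for illustration only:

```python
import numpy as np

# Small hypothetical set of true and predicted SNRs (in dB).
y_true = np.array([12.0, 15.5, 9.0, 20.0])
y_pred = np.array([11.5, 16.0, 9.5, 19.0])

n_images = y_true.size
mae = np.sum(np.abs(y_true - y_pred)) / n_images           # mean absolute error
rmse = np.sqrt(np.sum((y_true - y_pred) ** 2) / n_images)  # root-mean-square error

print(mae)   # 0.625
print(rmse)  # ~0.6614
```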
This CNN programme retrieves the true SNR from the database and computes the predicted SNR value by processing the run-time-generated eye diagram images. The MAE is calculated by comparing SNR(True) and SNR(Predict). Moreover, the BER values are extracted from the SNRs, as shown in Figure 12.
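The SNR-to-BER extraction step can be illustrated as follows. The OOK-style mapping BER = Q(√SNR) used here is a common textbook relation and an assumption on our part; the paper's exact conversion formula is not reproduced in this section:

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def ber_from_snr_db(snr_db):
    """Assumed OOK mapping: BER = Q(sqrt(SNR)), with the SNR given in dB."""
    snr_linear = 10.0 ** (snr_db / 10.0)
    return q_function(math.sqrt(snr_linear))

# BER falls sharply as the SNR improves.
for snr_db in (6.0, 10.0, 15.69):
    print(snr_db, ber_from_snr_db(snr_db))
```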

Results and Discussion
The proposed CNN models have successfully predicted the SNR with high performance. Figures 13-15 depict the learning curves, which show the training and validation results for both the loss and the RMSE, together with their ratios relative to the validation values. We discarded the scatter plot because it created point-overlapping distortion. The graphs in this study, referred to as "standard hyperparameters", display the dropout rate, learning rate, and number of epochs. The quantity of filters utilised in this study was varied, as indicated in Tables 3 and 4. Table 5 provides a comprehensive summary of the models' performance via the train/validation loss and train/validation RMSE at the last epoch, as well as the number of trainable parameters and the loss and RMSE ratios. The equation for each ratio is as follows:

Loss Ratio = Training Loss / Validation Loss,
RMSE Ratio = Training RMSE / Validation RMSE.

The nearer the ratio is to 1, the better the model fits, so that the model can make a correct decision and the predicted SNR is more likely to approach the actual value.
The statistical analysis includes the minimum, maximum, and mean of the results. For example, the training time ranges from 8.33 to 10.99, whereas the predicting time ranges from 0.1732 to 0.2098. We observed no significant fluctuation in time, although the number of filters changed. The maximum differences between 1 and the loss and RMSE ratios are 0.3381 and 0.4153, respectively, using 48 and 56 filters in the CNN implementation. In contrast, the minimum differences between 1 and the loss and RMSE ratios are 0.0297 and 0.0183, respectively, when using 20 filters. In addition, the average of |1 − Loss Ratio| is 0.2107, and that of |1 − RMSE Ratio| is 0.2551, which is very close to zero, as shown in Table 3. Table 4 displays the constant hyperparameters and their corresponding values used in this study, including the colour mode of the eye diagram images, the optimisation of the model, and other hyperparameters, as shown in this table. The primary motivation for the set of hyperparameters in Table 4 is to ensure that the CNN engine achieves its optimum model accuracy fitting and operates safely within a region away from the over-fitting and under-fitting boundaries. Moreover, this stable fitting region is broad enough for optimum processing time.
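The fit-quality ratios and their distances from 1 can be computed as in the sketch below; the numeric values are illustrative, not entries from Table 5:

```python
# Hypothetical last-epoch values for one model (not the paper's Table 5 data).
train_loss, val_loss = 0.95, 1.02      # MAE on training / validation sets
train_rmse, val_rmse = 1.20, 1.31      # RMSE on training / validation sets

loss_ratio = train_loss / val_loss     # training/validation, ideally close to 1
rmse_ratio = train_rmse / val_rmse

# The paper summarises fit quality via the distance of each ratio from 1.
print(abs(1 - loss_ratio))
print(abs(1 - rmse_ratio))
```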
The Pearson correlation coefficients between the number of parameters and the other reported results are displayed in Table 6. Loss, validation loss, RMSE, and validation RMSE show strong correlations, indicating that the cost function decreases while the CNN size increases. Therefore, the performance of the CNN model is enhanced by increasing its size. Table 7 presents the Pearson correlation coefficients between the number of filters and the other results, such as the training and predicting times, loss, RMSE, and their validation counterparts, together with interpretations of these correlations. Figures 16-19 show the relationship between the number of filters used in the CNN models and this information. Figure 16 shows the weak correlation between the number of filters and the training and predicting times, which indicates that the curve is almost constant. While it is expected that increasing the number of filters in a CNN would result in more computations and, therefore, more time to complete them, using a highly capable VM mitigates the impact of the increased computations on time, making it negligible. The correlation coefficients range from medium to very strong concerning loss, validation loss, RMSE, and validation RMSE. When the number of filters increases, the capacity of the model (trainable parameters) also increases, which allows it to fit the training data better and improves its effectiveness (see Figure 17). Moreover, the loss and RMSE ratios are close to 1, as seen in Figure 18, which means the model makes good decisions. On the other hand, the positive upward curve displays a very strong linear direct correlation between the number of filters and the number of parameters; it forms a perfectly straight line (see Figure 19). The reason is that the total number of values inside all filters increases when the filters are increased; these values are trainable parameters, so the number of trainable parameters increases.
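The perfectly linear filters-versus-parameters relationship can be checked with NumPy's Pearson coefficient; the parameter formula below is a toy linear stand-in, not the real Keras parameter count:

```python
import numpy as np

# Filter counts of the 13 models: 16, 20, ..., 64.
filters = np.arange(16, 68, 4)

# Hypothetical linear growth of trainable parameters with the filter count.
params = 5000 * filters + 12000

# Pearson correlation coefficient between the two quantities.
r = np.corrcoef(filters, params)[0, 1]
print(round(r, 6))  # 1.0 for a perfectly straight line
```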
To assess the performance of the CNN models operating in the optical receiver, which can handle the ISI noise in UWOC, the relationship between the actual and predicted values of the SNR and BER is drawn in Figure 20. This result illustrates the models' outcomes using a 0.45 dropout rate and a learning rate of 10⁻⁵ with 28 filters, as shown in the figure.
The CNN models can predict correct results for harbour and coastal waters using Gaussian and Rectangular optical pulse shapes, with pulse widths ranging from 0.1 to 0.95, using the DGF, WDGF, CEAPF, and BP channel models. The trend of the curves is similar to the identity function, represented by the red line (y = x), which means the actual values are close to the predicted ones for both the SNR on the left and the BER on the right. This shows that the CNN model can decide correctly in various situations involving various types of water, ISI noise, and water environment variations. The relation between SNR and BER, which represents the performance of the optical receiver, is drawn in Figure 21, using a 0.45 dropout rate and a 10⁻⁵ learning rate with 28 filters. From the graphs, we can conclude that the suggested CNN models perform well in making accurate decisions for various instances involving various types of waters, ISI noise, and underwater environment variations.
Regarding the high BER values, these correspond to the small SNRs included in our models, which are due to the significant noise and channel fluctuations considered in this study. Although the SNRs originate from received pulses that pass through noisy channels, the models can still predict the SNRs accurately.

Conclusions
This study successfully demonstrated the implementation of a novel CNN-based decision unit strategy in an optical receiver for UWOC systems. The proposed CNN models are found to predict the SNR effectively with high performance, with the training and validation losses and RMSEs demonstrating convergence towards smaller values. The results show a strong inverse correlation between the number of parameters in the model and the cost function, suggesting that increasing the CNN model's size enhances its performance. Even in diverse water types with fluctuating noise levels and environmental variability, employing a CNN model as a decision unit in an optical receiver enables efficient decision-making with a low cost function.
Our innovative CNN tool's architecture and supporting mathematical formulations make it agnostic to the UWOC channel model and transmission modulation format. Hence, if any or all of the channel models in Table 1 are proven not to partially or fully satisfy the linear time-invariant system (LTIS) condition requirements, replacing any or all of these models with ones that comply will not impact the CNN tool's computational software algorithm. Still, it might require altering the hyperparameters of the CNN model platform structure shown in Table 4 to ensure optimum model accuracy fitting; from a hardware perspective, however, we do not expect any necessary change to the hosting math processor of the DU. It is worth mentioning that the LTIS requirements are expressed in terms of the channel path loss, mean delay, root-mean-square delay spread, and the constancy of the frequency bandwidth with the model temporal profile broadening over link spans.

Future Studies
In future studies, we plan to elevate the effectiveness of our ML model for the UWOC system through a two-pronged strategy: dataset expansion and CNN refinement. The first cornerstone of our approach is the extension of our dataset. We aim to generate more eye diagram images, diversifying and enriching the data available for the ML model. This broader dataset will fortify the model's learning capabilities and enhance its predictive precision. Simultaneously, we propose a strategic refinement of our CNN's hyperparameters. We contemplate introducing two or three hidden layers into the network's architecture, which could amplify the model's ability to detect intricate features and, in turn, boost its accuracy.

Figure 1. ML algorithms in optical performance monitoring.

Figure 4. Typical direct detection optical receiver model.

• DGFs with distances of 5.47 m and 45.45 m for harbour and coastal waters, respectively, and a (20°, 180°) field of view (FOV).
• WDGFs and CEAPFs with distances of 10.93 m and 45.45 m for harbour and coastal waters, respectively, and a 20° FOV.
• BP with 5 m and 10 m distances for harbour and coastal waters, respectively, and a 180° FOV.

Figure 7. Components and the functions of an artificial neuron.

Figure 8. A schematic representation of predicting SNRs with various numbers of filters.

Figure 9. Examples of eye diagram images.

Figure 11. The CNN architecture and implementation.

Figure 12. Scheme of calculating the MAE of the True and Predicted data.

The training and validation curves show a gradual decrease in both loss and RMSE as the number of epochs increases, eventually converging to similar values. As the number of filters (16, 20, 24, and 28) in the CNN architecture increases, the training and validation loss and RMSE decrease, as seen in Figure 13. The crucial metrics are the loss and RMSE ratios, which are approximately equal to 1; this indicates that the models are highly accurate and efficient in predicting the actual SNR values. Figures 14 and 15 demonstrate a decrease in both the loss and RMSE. However, a slight divergence was observed between the training and validation curves, indicating a minimal gap between the loss and RMSE values for the training and validation datasets.
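The "ratio approximately equal to 1" criterion can be made concrete with a small helper that compares the final validation metric to the training metric. The values below are hypothetical placeholders for illustration, not the results reported in this paper:

```python
def fit_ratio(train_metric, val_metric):
    """Validation-to-training metric ratio: values close to 1.0 suggest
    the model generalises well (no large over- or under-fitting gap)."""
    return val_metric / train_metric

# Hypothetical final-epoch RMSE values per filter count: (train, validation)
final_rmse = {16: (0.048, 0.053), 20: (0.041, 0.044),
              24: (0.035, 0.037), 28: (0.031, 0.032)}

ratios = {n: round(fit_ratio(tr, va), 3) for n, (tr, va) in final_rmse.items()}
```

A ratio drifting well above 1 as training proceeds would signal the divergence between training and validation curves described above.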

Figure 16. Number of filters vs. training and predicting time for all CNN models.

Figure 17. Number of filters vs. training and validation for both loss and RMSE for all CNN models.

Figure 18. Number of filters vs. loss and RMSE ratios for all CNN models.

Figure 19. Number of filters vs. number of trainable parameters for all CNN models.

Figure 20. Performance of the CNN models: the true versus the predicted SNR (left) and BER (right) values. Various channel models are employed to simulate the behaviour of water in harbours (represented by the colour blue) and coastal areas (represented by the colour green) for different pulse widths.

Figure 21. SNR vs. BER for harbour water (left) and coastal water (right). The true (red) and predicted (blue) values are for different pulse widths using different channel models.

Table 1. List of the models' channel impulse response functions.

Table 2. The required attributes for generating eye diagram images and calculating SNR.

Table 3. Statistical information of the results.

Table 4. Standard for the CNN model platform structure.

Table 5. Performance summary of the dataset's model regarding training and validation loss and RMSE.

Table 6. Pearson correlation coefficients between the number of parameters and the results' information, and their interpretation.

Table 7. Pearson correlation coefficients between the number of filters and the results' information, and their interpretation.