Cross-Bands Information Transfer to Offset Ambiguities and Atmospheric Phenomena for Multispectral Data Visualization

Visualization of multispectral images through band selection methods causes an information loss that in most cases proves critical for an adequate understanding of the represented scene. The R-G-B representation obtained by mapping the visual bands to the R, G, and B channels is widely used because it closely resembles natural color and the aspects perceivable by the human eye. However, despite the similarity in terms of color code, ambiguities between classes such as water and vegetation, and atmospheric phenomena like fog, clouds, and smoke that other bands can penetrate, remain visible and hinder the visualization of the Earth surface. This article presents a set of five different methods to offset the effects caused by ambiguities, fog, light clouds, and smoke by transferring relevant information between bands in order to visually reconstitute those parts of the image affected by atmospheric phenomena. The general concept shared by these methods involves a stacked autoencoder that encompasses the information from all spectral bands into a latent representation used for visualization. Each proposed method is defined by a different combination of input and error function formula. Spectral and polar coordinates features represent the possible options for the input, while formulas based on mean squared error or angular spectral distances determine the potential choices in terms of error function definition. The property of angular spectral distance and polar coordinates transformation to obtain illuminant invariant features determined their use in three out of five methods. We evaluate the methods through spectral signature graphical comparison and visual comparison related to the R-G-B representation. We conduct experiments on multiple Sentinel 2 full images.

Mihai Datcu is with the Research Center for Spatial Information, University Politehnica of Bucharest, 061071 Bucharest, Romania, and also with the Remote Sensing Technology Institute, German Aerospace Center, 82234 Oberpfaffenhofen, Germany (e-mail: mihai.datcu@dlr.de).
Digital Object Identifier 10.1109/JSTARS.2021.3123120
... a multidimensional product defined by the number of spectral bands of the sensor and the ground footprint. Spectral bands represent radiance measurements of data acquired from different regions of the electromagnetic spectrum. For example, the Sentinel 2 satellite senses 13 wavelength intervals from the visual, NIR, and SWIR parts of the spectrum, resulting in 13 spectral bands.
As the display of an image is limited to three bands, the visualization of a multispectral EO product is usually performed by mapping the three bands of the visual part of the spectrum to the R-G-B channels. However, the captured images very often contain atmospheric phenomena such as clouds, fog, or smoke. The wavelengths of the bands in the visual spectrum are short, so they fail to penetrate the haze and the information about the terrestrial surface is lost. The visualization shows high pixel values generated by the strong reflection of these atmospheric phenomena. In most cases, the information about the surface can be obtained from the SWIR bands because, due to their longer wavelength, they can penetrate the haze and be acquired by the sensor.
In addition to atmospheric phenomena, another aspect can interfere with a proper visualization of the scene: in the absence of the information contained in the spectral bands other than those mapped to the R-G-B channels, ambiguities may occur between apparently similar or dissimilar regions, or some details about the scene may be absent. For instance, a lake covered with reed may reflect similarly to the nearby meadow in the visible portion of the spectrum. However, the lower spectral response of water in NIR can help the user better differentiate between the two areas if information in the NIR bands is considered for visualization.
Visualization of remote sensing images is of great importance for a large number of applications, including disaster management, deforestation, and climate change management. Thus, improvement is needed, a good example being given by the scenes in Fig. 1, where the enhanced visualizations emphasize the amount of information lost by using only three bands for analysis. The first row of compared scenes highlights a commonly encountered ambiguity, between vegetation and water bodies, which disappears with an improved visualization. The second row represents a scene of an ongoing fire where details about the surface are hidden in the R-G-B representation, but would be very useful for image analysis.
Taking all these aspects into consideration, we propose a set of methods to improve visual analysis by embedding the information contained in all the spectral bands into a latent representation of three values using a stacked autoencoder (SAE). These values are mapped to the R-G-B channels for visualization. Our principal objective is to improve visualization by reducing the ambiguities and obstructions generated by the lack of information in the displayed image compared to all spectral bands in the multispectral product. In order to reduce obstruction, we apply transformations that generate illuminant invariant features. We aim to accomplish these objectives without affecting the spatial resolution.
Demonstrating the aforementioned objectives requires diverse datasets. Thus, the following four main scenarios are considered: clear, smoky, foggy, and cloudy images.

II. PROPOSED CONCEPT
We based our concept on the following premises:
1) NIR, SWIR, and the other spectral bands not used for the R-G-B visualization may contain information useful for reducing ambiguities and visual obstructions.
2) Angular spectral distance and polar coordinates transformation yield illuminant invariant features and are suitable for reducing image contamination.
3) The autoencoder is a powerful network that learns in an unsupervised way to embed reflectance information into a lower dimension.
Given these premises, we propose a set of methods based on a general concept: the use of an autoencoder to embed information from all bands into only three bands. These resulting bands are subsequently mapped to the R-G-B channels, and a false color visualization is thus obtained. The methods differ in the actors involved. The motivation behind this diversity is the different utility of each of the proposed approaches.
Fig. 2 illustrates the general concept underlying the five proposed methods. An autoencoder is the core and the common part in defining these methods because it performs the operation of embedding information from several bands into a three-dimensional latent representation. This representation is then mapped to the R-G-B channels for visualization. The actors that differentiate the five methods are the input, that is, the way the data enters the network, and the way the error function is computed. The data can enter the network in the form of radiance values or of values transformed to polar coordinates. Regarding the error function, we start from a general computation that evaluates the difference between input and output, and change it from one method to another by including additional evaluations or by transforming the compared values into polar coordinates or angles. Polar coordinates and angular spectral distance are used because they yield illuminant invariant features.

III. RELATED WORK
Until now, many scientific fields have addressed the problem of visual analysis [1]-[4]. Being suitable for improved visual perception, the hue, saturation, and value (HSV) color model is widely used to enhance visualization by transforming image data into it [5], [6].
Chang et al. [7] define MPEG-7 as a suite of image descriptors for multimedia processing that has great potential.
Multimedia image visualization enhancement has been and still is a much debated issue, and a number of research works have addressed it with good results [8]-[10]. Nevertheless, applying them to multispectral EO data visualization is not possible without significant adjustments. Compared to multimedia data, multispectral images have more than three bands and they do not always represent information that is perceivable by the human eye (i.e., NIR and SWIR bands). These differences call for distinct approaches.
As Jacobson et al. [11] and Polder et al. [12] highlight, a commonly used method to visualize high-dimensional data is a three-channel color representation, R-G-B, which gives a quick overview of the scene. Several band selection methods have been proposed over the years in order to provide an enhanced solution to visualization [13]-[15].
Displaying multispectral data in any triplet of bands mapped to the R-G-B channels generates incomplete information because all the information contained in the unused bands is lost. However, Wang et al. [17] showed in their work that visualization techniques can highlight the most relevant information, enhancing visual perception. Thus, an enhanced method for visualization is needed.
Dey et al. [16] mention in their book that noise and loss are among the five main visualization problems. Noise refers to the ambiguity determined by the fact that objects are relative to each other, while loss refers to contamination generated by decreased visibility.
Fig. 2. General architecture of the proposed methods for visualization of multispectral images. This article proposes five different neural network based methods to improve visualization of multispectral remote sensing images. All methods share the same general neural network: a stacked autoencoder which compresses the input into a hidden representation, subsequently used for the result visualization. The approaches differ in the objective to be met and in the distinctive combination of involved actors: input and error function. The main objectives are the reduction of ambiguities and of contamination. Spectral and polar features are the possible options for the input, and the error function varies in terms of augmentation (method II) and transformation applied (methods III, IV, and V).
Sovdat et al. [18] define the natural color product and propose two approaches to compute it. The approaches are not restricted to Sentinel 2, but could also be used for other optical sensors. The efficacy of the methods depends on the properties of the imaging instrument.
Remote sensing images represent a great source of information for monitoring different aspects of the Earth surface. A multitude of neural network based methods has been presented and demonstrated to be very useful [19]-[21]. Also, convolutional neural networks have proven their applicability to spectral image analysis through classification [22]-[24]. Because the convolutional transformation introduces spatial averaging, we avoid this type of network in accomplishing our objectives.
Different methods to eliminate or detect clouds in Sentinel-2 images have been developed. Luotamo et al. [25] propose a method that uses a CNN, Singh et al. [26] use a cyclic GAN to accomplish the removal, and Homem Antunes et al. [27] use the 6S model to perform atmospheric correction and cirrus cloud removal.
A different approach to image inspection consists of using neural networks for feature extraction, colorization, or classification. Autoencoders are neural networks that have gained considerable attention over the years due to their capability of embedding data into a lower-dimensional representation in an unsupervised way [29]-[31]. The purpose of these works is mainly to obtain the best classification of the obtained latent representations. Usually, autoencoders achieve a relevant transfer of information between the dimensions to be reduced. Vincenzi et al. [28] use autoencoders to perform a colorization task and compare the results with the R-G-B representation. Applications of autoencoders to improve visualization have been presented by Neagoe et al. [33], [34].
The visualization of multispectral EO images remains an unresolved topic. Current methods address specific issues, being limited to one particular atmospheric phenomenon or type of contamination that affects the visual analysis process, and are unable to generalize to multiple situations.
After analyzing the existing methods, their data transformation approach, and the purpose of their application, we conclude that, in order to improve visualization for visual inspection by a human operator, we need a method that preserves spatial information and resolution. The method has to learn from the spectral space and not from the spatial one. The autoencoder learns to embed information in an unsupervised way and does not perform any spatial averaging, thus proving to be a suitable solution to our problem.

IV. MULTISPECTRAL DATA REPRESENTATION METHODS
Multispectral remote sensing images are conventionally represented as a plot of the image features in a multispectral vector space whose dimension equals the number of spectral components. The distance between two vectors, A and B, in this space may be computed using the Euclidean distance, d, or using the angular distance, θ. While the Euclidean distance is the length of the segment that connects the endpoints of the two vectors, the spectral angular distance is the angle, θ, between the two vectors:
$$d(A, B) = \sqrt{\sum_{i=1}^{N} (a_i - b_i)^2} \qquad (1)$$
$$\theta(A, B) = \arccos\left(\frac{\sum_{i=1}^{N} a_i b_i}{\sqrt{\sum_{i=1}^{N} a_i^2}\,\sqrt{\sum_{i=1}^{N} b_i^2}}\right) \qquad (2)$$
where N represents the number of spectral features and $a_i$ and $b_i$ are the feature values for each band, i = 1 ... N. Sohn et al. [36] relied on spectral angular distance to measure distances in feature space and perform image clustering and classification. The invariance of this distance to linearly scaled variations has been proved.
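The difference between the two distances can be illustrated with a minimal NumPy sketch. The function names and the four-band pixel values are hypothetical and chosen only for illustration; the sketch uses the standard definitions of Euclidean and angular distance and is not taken from the authors' code.

```python
import numpy as np

def euclidean_distance(a, b):
    """Length of the segment joining the endpoints of vectors a and b."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sqrt(np.sum((a - b) ** 2)))

def angular_distance(a, b):
    """Angle in radians between spectral vectors a and b."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    # Clipping guards against round-off pushing the cosine outside [-1, 1].
    return float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

# Hypothetical four-band pixel and the same pixel under half the illumination.
pixel = np.array([0.12, 0.30, 0.45, 0.20])
shaded = 0.5 * pixel

d = euclidean_distance(pixel, shaded)    # grows with the brightness change
theta = angular_distance(pixel, shaded)  # stays (numerically) at zero
```

Under these assumptions, the linearly scaled pixel lies at a clearly nonzero Euclidean distance from the original, while the angle between the two remains practically zero, which is the invariance property the text relies on.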
In order to obtain an improved analysis of multispectral remote sensing images with large cloud coverage or shadows, Okamura et al. [35] developed an illuminant invariant feature descriptor based on a polar coordinate transformation of the reflectance values. Also, Georgescu et al. [37] proposed the use of the polar transformation to generate a polar feature space which is used for the computation of the MPEG-7 scalable color descriptor. These works were motivated by the property of the spectral angular distance to be invariant to linearly scaled variations, along with the preservation of spectral signatures.
Polar coordinate transformation is a computation through which radiance values are transformed into angles θ and a distance ρ. For a product with N spectral bands, the result consists of N-1 angles and one distance, so the polar representation has the same dimensionality as the spectral one. The mathematical equations used to transform radiance values, x ∈ {x_1, x_2, ..., x_N}, into polar coordinates are given in (3).
As the utility of the polar coordinates transformation has been demonstrated in previous works, we include this conversion in some of our visualization enhancing methods to emphasize the obtained results.
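As an illustration of how N radiance values can be turned into N-1 angles and one distance, the sketch below uses the standard hyperspherical coordinate convention. This convention is an assumption: the exact formulas used in the article may differ in ordering or in the trigonometric function applied. The function name `to_polar` and the sample pixel are hypothetical.

```python
import numpy as np

def to_polar(x, eps=1e-12):
    """Map pixels of shape (..., N) to (..., N): N-1 angles followed by rho.

    Assumed convention (standard hyperspherical coordinates):
      rho     = sqrt(x_1^2 + ... + x_N^2)
      theta_i = arccos(x_i / sqrt(x_i^2 + ... + x_N^2)),  i = 1 ... N-1
    """
    x = np.asarray(x, dtype=float)
    # tail[..., i] = sqrt(x_i^2 + x_{i+1}^2 + ... + x_N^2)
    tail = np.sqrt(np.cumsum(x[..., ::-1] ** 2, axis=-1))[..., ::-1]
    rho = tail[..., :1]
    cos_angles = x[..., :-1] / np.maximum(tail[..., :-1], eps)
    angles = np.arccos(np.clip(cos_angles, -1.0, 1.0))
    return np.concatenate([angles, rho], axis=-1)

# Hypothetical four-band pixel and the same pixel under half the illumination.
pixel = np.array([0.12, 0.30, 0.45, 0.20])
bright = to_polar(pixel)
shaded = to_polar(0.5 * pixel)
```

With this convention, a linear scaling of the pixel changes only the distance ρ (the last component), while the N-1 angles are unchanged, which is why the angles can serve as illuminant invariant features.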

V. MULTISPECTRAL IMAGE COMPRESSION IN THREE BANDS
During the past few years, multiple domains have used different solutions based on deep learning networks for diverse purposes, proving their efficacy with great success. In remote sensing, a multitude of such methods have been used for different purposes: dimensionality reduction, classification, visualization, etc. In this article, we use a neural network to develop our visualization improvement methods.
Deep learning neural networks perform a mapping between a certain set of inputs and a certain set of outputs from the training data. The model of a neural network is defined as a set of weights used to make predictions. The weights cannot be computed perfectly, thus the learning process can be perceived as an optimization problem. Usually, the stochastic gradient descent optimization algorithm is chosen to update the weights through backpropagation. The algorithm adjusts the weights to reduce the error for the next evaluation of predictions by moving down the gradient of the error.
Autoencoders are neural networks that learn in an unsupervised way to reconstruct an input, obtaining a latent representation of smaller dimension inside the network, at the bottleneck layer. An autoencoder includes two main structures: an encoder and a decoder. The input of the encoder, X, represents the object to be compressed, and its output is a latent embedded representation of the input, H. The output of the encoder is the input to the decoder, whose main task is to reconstruct X using H. The result of the decoder, Y, is a representation that must be as similar as possible to X, having the same dimensionality.
The encoding function, ϕ, maps the input to a hidden representation, H = ϕ(X). The decoder also performs a mapping, this time from the hidden representation H to a reconstruction Y of the input. During training, the network uses an optimization process that needs an error function to evaluate the loss and update the model weights. In its general form, the error function of an autoencoder evaluates the difference between X and Y. The error function can be defined and modified so that it evaluates the aspects of interest with regard to the purpose of the network. Also, the objects involved may be changed, transformed, or different from X and Y.
An SAE is an enlarged version of a basic autoencoder. The encoding and decoding operations are performed by sequences of layers, and the symmetry relative to the bottleneck layer is preserved.
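The encoder and decoder roles described above can be sketched as a plain NumPy forward pass. The layer sizes follow the "12-8-6-3" topology and the ELU activation reported in the experimental section; everything else is an assumption made for illustration: the weights are random and untrained, the activation is applied on every layer including the output, and the helper names `init_layers` and `forward` are hypothetical.

```python
import numpy as np

def elu(z, alpha=1.0):
    """Exponential linear unit activation."""
    return np.where(z > 0, z, alpha * (np.exp(np.minimum(z, 0.0)) - 1.0))

def init_layers(sizes, rng):
    """Random (weights, bias) pairs for consecutive layer sizes (untrained)."""
    return [(rng.normal(0.0, 0.1, size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(x, layers):
    """Apply each dense layer followed by ELU."""
    for w, b in layers:
        x = elu(x @ w + b)
    return x

rng = np.random.default_rng(0)
encoder_sizes = [12, 8, 6, 3]                    # topology from the experiments
encoder = init_layers(encoder_sizes, rng)
decoder = init_layers(encoder_sizes[::-1], rng)  # symmetric around the bottleneck

X = rng.random((5, 12))   # five hypothetical pixels with 12 scaled band values
H = forward(X, encoder)   # latent representation, one triplet per pixel
Y = forward(H, decoder)   # reconstruction with the same dimensionality as X
```

Because the network operates on one pixel at a time, the latent triplets H can be reshaped back to the image grid without any change in spatial resolution.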

VI. ENHANCED VISUALIZATION PROPOSED METHODS
This article proposes five different visualization improvement methods for multispectral remote sensing images based on an SAE neural network.
The proposed methods are different combinations of the previously defined actors, namely the input X and the error function.
Regarding the error, we start from the general function defined above and augment it with different evaluation terms in order to accomplish the desired objective of the neural network.
Regarding the input, we use two different kinds of features: spectral and polar. Spectral features are the reflectance values acquired at different wavelengths, mapped to the spectral bands of the product. Polar features, instead, are obtained through the polar coordinate transformation of the spectral values.
The naming scheme reflects the actors involved. The following subsections present the objectives, mathematical definitions, and utility of the five methods.

A. Spectral Input-Spectral Error (SI-SE)
The purpose of this method is to accomplish the first objective of this article by revealing the hidden details from an apparently accurately displayed scene.In this context, "apparently accurately displayed" means that although the scene is not contaminated by clouds or other atmospheric phenomena, it could hide important details due to the fact that only the reflectance values from the visible part of the spectrum are displayed.
The combination of actors for this method consists of an input of reflectance values and a loss evaluation using the error function defined by
$$\mathrm{error} = \mathrm{MSE}(X, Y) \qquad (12)$$
MSE stands for mean squared error and is very often used to compute the error function in neural network models. This error function evaluates the capability of the network to reconstruct the input from the hidden representation. The amount of information from the input contained in this reduced representation should be as large as possible so that the error becomes as small as possible.
Apparently similar regions could represent different things and apparently different regions could represent the same thing, as shown in Fig. 3. Water bodies and vegetation are easily confused because they have the same color in the R-G-B representation. However, the latent result obtained by this method and displayed for visualization reveals the differences. The information contained in the bands that are not involved in the visualization may be different, as can also be seen in Fig. 4, where the spectral signatures show different patterns among the spectral bands. The bands from the visible part of the spectrum share the same pattern, while the pattern changes toward the NIR and SWIR part of the spectrum. The latent representation signatures demonstrate the embedding ability of the network, capturing all the different patterns from all spectral signatures.

B. Spectral Input-Spectral Error With Color Correction (SI-SECC)
The second method comes as an improvement to the first one: besides fulfilling the first objective of eliminating the ambiguities, it aims to keep the visualization as close as possible to that obtained using the bands from the visual part of the spectrum mapped to the R-G-B channels. The actors involved are the spectral values and the loss computed over an augmented version of the previously defined error function, (12). The augmentation, (13), consists of adding an evaluation of the color difference between the latent values, H, obtained by the encoder and the values of the bands from the visual part of the spectrum, RGB_X. The color difference is computed using the Euclidean distance, (14).
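A hedged sketch of such an augmented error is given below. It adds a Euclidean color-difference term between the latent triplet H and the visual bands of X to the reconstruction MSE. Several details are assumptions and not taken from the article: the two terms are combined by a plain weighted sum with a hypothetical `weight` parameter, the color difference is averaged over pixels, and the 12-band stack is assumed to be ordered B1, B2, B3, B4, ..., so that red, green, and blue (Sentinel 2 bands B4, B3, B2) sit at indices 3, 2, and 1.

```python
import numpy as np

def mse(a, b):
    """Mean squared error over all pixels and bands."""
    return float(np.mean((a - b) ** 2))

def color_difference(h, rgb):
    """Mean per-pixel Euclidean distance between latent and visual triplets."""
    return float(np.mean(np.sqrt(np.sum((h - rgb) ** 2, axis=-1))))

def si_secc_error(x, y, h, rgb_idx, weight=1.0):
    """Reconstruction error plus a weighted color-difference term (assumed form)."""
    return mse(x, y) + weight * color_difference(h, x[:, rgb_idx])

rng = np.random.default_rng(0)
X = rng.random((4, 12))   # four hypothetical pixels, 12 scaled band values
RGB_IDX = [3, 2, 1]       # assumed positions of B4 (red), B3 (green), B2 (blue)

# Perfect reconstruction and a latent equal to the visual bands: zero error.
perfect = si_secc_error(X, X, X[:, RGB_IDX], RGB_IDX)
# Same reconstruction, but every latent channel is off by 0.1.
shifted = si_secc_error(X, X, X[:, RGB_IDX] + 0.1, RGB_IDX)
```

The second case shows that the added term penalizes latent colors that drift away from the natural color triplet even when the reconstruction itself is exact.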
Fig. 5 shows that the latent signature patterns are preserved and tend to be more similar to the visual spectral signature patterns. Although the SWIR pattern of the input spectral signatures does not seem to be observable, the third band of the latent result shows some small peaks which seem to be related to it. The comparison between R-G-B and SI-SECC in Fig. 6 emphasizes the capability of the autoencoder to embed information from the NIR and SWIR bands, because the smaller water bodies to the left of the sea are not visible in the R-G-B representation.

C. Spectral Input-Polar Coordinates Error (SI-PcE)
This method has been developed in order to satisfy both objectives of this article: the reduction of ambiguities and of the visual contamination caused by clouds, smoke, or fog. Illuminant invariant features were successfully used for tasks like dehazing; therefore, we chose to demonstrate their superiority when integrated within a deep learning method.
The input is represented by the spectral features, and the loss function computes the MSE between the transformation of X to polar coordinates, polar_X, and the transformation of Y to polar coordinates, polar_Y:
$$\mathrm{error} = \mathrm{MSE}(\mathrm{polar}_X, \mathrm{polar}_Y) \qquad (15)$$
Looking at the graphical representation of the spectral and latent signatures in Fig. 7, it can be observed that the signature patterns of the input are preserved and embedded into a combination of three bands, each different pattern from the input being dominant over one band in the latent representation. This demonstrates the compression capability of the proposed method and the preservation of spectral signature patterns, resulting in a more complete visualization of the observed scene. Fig. 8 highlights the better visualization result obtained with SI-PcE. Although the left side of the figure shows a scene covered by clouds, the right one succeeds in disclosing the Earth surface. Regarding computation time and complexity, this method is one of the most demanding because at each epoch, before evaluating the loss, the algorithm has to perform a polar transformation over Y.

D. Polar Coordinates Input-Spectral Error (PcI-SE)
This method aims to accomplish both of our objectives in terms of visualization enhancement of multispectral remote sensing images. An additional objective of this method is to reduce shadows as much as possible. This supplementary goal comes from the property of polar coordinates to be illuminant invariant.
By using these polar transformed features as input to the network and evaluating the loss over an error computed using spectral features, we address the goal of combining the preservation of the spectral signatures with illumination invariance. In terms of the actors involved, this method is the inverse of the previous one, so the error function is computed by
$$\mathrm{error} = \mathrm{MSE}(\mathrm{spectral}_X, \mathrm{spectral}_Y) \qquad (16)$$
Fig. 9 shows the pattern merging effect that takes place in the latent representation signature: each resulting band is influenced by the patterns of all input spectral features. The preservation of spectral signatures is clearly observable, meaning that the goals are achieved. Fig. 10 represents a scene of an ongoing fire whose visualization is enhanced by our proposed method. The smoke from the R-G-B representation is predominantly removed, making smoked areas, remaining vegetation, and the fire borderline visible.

E. Spectral Input-Angular Error (SI-AE)
Visual enhancement through the reduction of ambiguities and atmospheric phenomena is the main objective of this method. With the auxiliary purpose of verifying the property of the angular distance of being invariant to linearly scaled variations of spectral values, we developed a method that includes this distance in the error function of the neural network, given in (17). The angular distance involves the input to the network, in this case the spectral features, and the decoder output.
Latent representation signatures show a mixed preservation of the spectral ones, each band from the latent space consisting of multiple patterns from the spectral space (Fig. 11). Thus, the objective of embedding the information from all spectral bands is achieved. Ambiguities are eliminated from the visualization, as Fig. 12 shows. Also, smoke and shadows present in the R-G-B representation are reduced, demonstrating the illuminant invariant character of the angular distance. This representation demonstrates the utility of the information transfer obtained by the autoencoder by revealing information that is not visible in the R-G-B representation.

VII. EXPERIMENTAL RESULTS AND DEMONSTRATION
To demonstrate how our proposed methods sharpen the data visualization, we used Sentinel 2 images acquired at different moments in time. The footprints of the analyzed scenes cover multiple regions of the world. Table I lists the products used, together with city and country. All the scenes were resampled before use so that all bands have 10 m resolution. Thus, we upsampled the bands with 20 and 60 m resolutions by setting each output pixel to the nearest input pixel value. The resulting products have the same dimension, 10980 × 10980 × 12.
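For integer resolution ratios, nearest-neighbor upsampling amounts to repeating each pixel along both axes. The sketch below shows this with NumPy on a tiny hypothetical band; the function name is an assumption, and the authors may have used a different tool for the resampling. A 20 m band would use a factor of 2 and a 60 m band a factor of 6 to reach the 10 m grid.

```python
import numpy as np

def upsample_nearest(band, factor):
    """Repeat every pixel `factor` times along rows and columns."""
    return np.repeat(np.repeat(band, factor, axis=0), factor, axis=1)

# Tiny hypothetical 20 m band brought to the 10 m grid (factor 2).
band_20m = np.array([[1, 2],
                     [3, 4]])
band_10m = upsample_nearest(band_20m, 2)
```

No new values are created by this operation, so the upsampled bands keep the original measurements and only the grid becomes finer.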
The processing level 2 product of the Sentinel 2 sensor does not contain band 10, so the resulting product contains only 12 bands. The reason behind the removal of this band is that it does not contain surface information.
The SAE architecture used by all methods contains four autoencoders, as shown in Fig. 13. The encoder is defined according to a topology that decreases from 12 inputs to 3 following the pattern "12-8-6-3," and the decoder is defined by an ascending topology following the same pattern. We used the ELU activation and Adam as optimizer. The hidden representation consists of three values for visualization purposes, the other layer sizes being chosen experimentally. The training dataset for all methods consisted of four concatenated subsets of clear, smoky, foggy, and cloudy images, and had a dimensionality of 10980 × 10980 × 12.
Before starting the training process, we performed a reshaping operation by transforming the M×N×B dimension into (M * N)×B, resulting in a matrix, T, with one pixel per line, each line consisting of the reflectance values from the spectral bands. M stands for the number of rows, N for the number of columns, and B for the number of spectral bands. Also, we performed a min-max scaling operation over T:
$$x'_{i,j} = \frac{x_{i,j} - \min}{\max - \min}$$
where $x_{i,j}$ represents the radiance value from the ith line and jth band, i = 1 ... M * N, j = 1 ... B, and min and max are computed over the entire matrix.
Parameters like batch size, learning rate, and number of epochs change from method to method, as illustrated in Table II. The table contains the different parameter setups that were implemented and tested, the rows in bold representing the final setup. The configurations were adjusted until the loss no longer decreased in the final five to seven epochs and its value was the smallest.
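The reshaping and scaling steps can be sketched in a few lines of NumPy. The function names and the tiny random cube standing in for a full 10980 × 10980 × 12 product are hypothetical; the scaling uses the global minimum and maximum of the matrix, as described in the text.

```python
import numpy as np

def to_pixel_matrix(cube):
    """Reshape an M x N x B cube into an (M*N) x B matrix, one pixel per row."""
    m, n, b = cube.shape
    return cube.reshape(m * n, b)

def min_max_scale(t):
    """Scale to [0, 1] using the minimum and maximum of the entire matrix."""
    t = t.astype(float)
    t_min, t_max = t.min(), t.max()
    return (t - t_min) / (t_max - t_min)

rng = np.random.default_rng(0)
cube = rng.integers(0, 10000, size=(4, 5, 12))  # tiny hypothetical stand-in
T = min_max_scale(to_pixel_matrix(cube))        # shape (20, 12), values in [0, 1]
```

Using a single global minimum and maximum, rather than one pair per band, keeps the relative magnitudes between bands unchanged, so the shape of each spectral signature is preserved after scaling.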
We implemented the experimental code using Python 3.8.5 and TensorFlow 2.3.0 for GPU.To reduce model training time, we used a distributed system and parallelized computation across 8 PCIe-connected K80 GPUs.
All the visualization operations were performed using Sentinel Application Platform.
In this article, we aim to demonstrate that the visualization of multispectral remote sensing images can be enhanced by embedding the information from all spectral bands into a result of three latent bands. Our main objectives are to reduce the ambiguities that may occur and to reduce the atmospheric phenomena that may obstruct the analysis of the terrain surface.
We propose a set of visualization methods based on the same general concept of SAEs, which are differentiated by the combination of actors involved in the learning process. Two of them are oriented toward the first objective, the reduction of ambiguities, and the other three pursue both objectives. With the help of the experimental results, we highlight the benefit of visualizing a multispectral image using one of the methods. We do not evaluate the general performance of each method or rank the methods because, as the experimental results show, any of the methods can be the most suitable depending on the scenario and on what the user is looking for.
The results are grouped into four scenarios (clear, foggy, cloudy, and smoky images) and discussed in the next subsections.

A. Clear Images
Visualization of multispectral remote sensing images is a process of analyzing and understanding the scene so that, after identifying the points of interest, the user can make a decision according to their purpose.
Usually, when visualizing a scene, the user wants to understand the information about the analyzed surface, but this is not always possible when the image representation uses a combination of three bands mapped to the R-G-B channels. Even if the scene does not contain atmospheric phenomena that obscure the information about the Earth's surface, there can be many hidden details in the spectral bands not included in the visualization.
Fig. 14 demonstrates the benefit of visualizing a representation that contains the information from all spectral bands, which makes the differences between apparently similar regions observable. All five methods accomplish this discrimination, obtaining a contrast between dissimilar regions. The main reason all five methods achieve this goal is that they include both vegetation information (bands 5, 6, and 7) and information from bands acquired at wavelengths in the NIR or SWIR range.
Both scenes present subsets containing examples of semantic class separation between vegetation and water. For example, the first subset of the scene in Fig. 14 includes a water body in the center right part of the image, which is not visible in the R-G-B representation. The other five representations show a strong contrast between the water body and the surrounding regions.
The third subset of the second scene shows a more crowded image, where the improved representations highlight the water body in the center left part. Although the first two methods do not include any transformation to polar coordinates or angles, their results show an improved visualization. The second one even achieves a color resemblance to the R-G-B representation.
For the scenario of clear images, all methods show a visualization gain and demonstrate well preserved input information while transferring relevant knowledge between bands.

B. Smoky Images
Natural disaster monitoring applications very often require an analysis of the Earth's surface using satellite images. Optical sensors acquire light reflected from the Earth's surface, but in the case of disasters such as fires, the values recorded at wavelengths that cannot penetrate the smoke generated by burning terrestrial objects cause image obstruction. The bands in the visible part of the spectrum are often the ones affected, and visualizing such a scene using only these bands could cause a loss of information about what is under the smoke. Therefore, including the information present in the NIR and SWIR bands can improve visualization and bring additional information about the Earth's surface. However, even the SWIR bands may be unable to penetrate dense smoke, in which case the information about the Earth's surface cannot be retrieved from the image.
When information about what is under the clouds of smoke is available in at least one of the bands, a view that includes this information is very beneficial in the analysis process. Fig. 15 shows two scenes of ongoing fires and demonstrates the advantage brought by the proposed visualization methods for investigation purposes. The subsets of the first scene demonstrate that the methods extract the available information about the appearance of the Earth's surface, while the subsets of the second scene show situations where information about certain regions is not available in any of the bands, so visualization of those areas is not possible. The SI-SE and SI-SECC methods show improved visualization compared to the R-G-B representation, but the methods that include the transformation into polar coordinates or the angular distance obtain better results, due to the property of illumination invariance. We mention that retrieving the information under the smoke was not the main objective of the SI-SE and SI-SECC methods, but their results prove that they are also suitable for this task. Due to the small dimensions of the results, the improvement is not visible in the first two subsets of the second scene, but at a higher resolution, hidden details about the Earth's surface are revealed. For these two subsets, the fifth method, SI-AE, discloses most of the information that is indistinct in the R-G-B representation.

C. Foggy Images
Fog is one of the atmospheric phenomena that can interfere with the visualization and analysis of the Earth's surface. The obstruction of visibility is similar to that encountered in smoke scenes, with the difference that, due to the size of the water droplets that make up the fog, the wavelengths available on optical sensors such as Sentinel 2 often cannot penetrate it. Retrieving information about the terrestrial aspect becomes impossible in the case of dense fog.
The subsets of the first scene in Fig. 16 show two different cases of retrieving the information contained in all spectral bands: the first two lines demonstrate the improvement of contrast and region distinction, even if the fog remains visible. The second line illustrates a reduction of the surface covered by fog in addition to the improved contrast.

D. Cloudy Images
Although clouds are of several types, dense or less dense, at a lower or higher altitude, there are certain situations in which longer wavelengths manage to penetrate them. Most of the time, however, not even the wavelengths in the SWIR range of the spectrum manage to pass through and capture information about the terrestrial aspect.
The last two images of Fig. 16 illustrate two scenes covered by clouds. The first highlights cases in which the clouds are less transparent, so that the surface below them can be seen only in certain parts, where they are less dense. The second is an example of a semi-transparent cloud, which allows the observation of scene details.
All subsets of the first cloudy scene show good information retrieval from the initial spectral bands. The most impressive result is obtained for the third subset of the first cloudy scene, where, due to high cloud coverage, the Earth's surface is not visible in the R-G-B representation, while all proposed methods reveal a lake under the cloud. The other two subsets are also remarkable, as the hidden details are likewise emphasized.
The subsets of the second scene show that even through a denser cloud some details are still perceivable using representations that make use of the information from the SWIR bands. The differences between the visualization of the visible bands and those obtained with the proposed methods are smaller, but they exist.

VIII. CONCLUSION
This article presents a series of methods to improve the visualization of multispectral remote sensing images following two main objectives: reducing ambiguity and reducing visual obstruction caused by atmospheric phenomena such as fog, smoke, or clouds. We have developed a variety of methods to obtain a complete visualization of the multispectral image. To this aim, given that certain wavelengths manage to penetrate fog or clouds, we computed the spectral angular distance and the transformation of the data into polar coordinates in order to achieve illuminant invariance. The autoencoder successfully incorporates this information.
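The illuminant invariance of the spectral angular distance can be checked with a minimal sketch. It assumes that a change in illumination acts as a uniform scaling of the pixel spectrum, which is a simplification; the function name `spectral_angle` and all reflectance values are hypothetical. Under that assumption, the angle between a pixel and a reference spectrum does not change when the pixel is dimmed.

```python
import math


def spectral_angle(x, y):
    """Return the angle in radians between two spectral vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        raise ValueError("spectral angle is undefined for a zero vector")
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_theta = max(-1.0, min(1.0, dot / (norm_x * norm_y)))
    return math.acos(cos_theta)


# Hypothetical reflectance values (illustration only).
pixel = [0.12, 0.10, 0.08, 0.35, 0.28, 0.15]
reference = [0.05, 0.06, 0.04, 0.02, 0.01, 0.01]
# The same pixel under dimmer illumination, modeled as a uniform scaling.
shaded = [0.4 * v for v in pixel]

angle_lit = spectral_angle(pixel, reference)
angle_shaded = spectral_angle(shaded, reference)
```

The two angles agree up to floating-point precision, whereas a mean squared error computed directly on the reflectance values would change with the scaling factor.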
Experimental results have demonstrated the capability of the proposed methods to extract complementary information from the spectral bands that are not typically used for visualization. Also, through the evaluation of spectral and latent signatures, we demonstrated the capability of the autoencoder to preserve patterns and transfer relevant information between spectral bands.
Details that are not visible in the R-G-B representation are emphasized in the results of the proposed methods, a very representative example being illustrated in Fig. 17. Two entire Sentinel 2 scenes covered by smoke and clouds, respectively, are compared. The visual improvement is clear, as the effects of these phenomena are reduced. These results also show the effectiveness of the proposed methods on large datasets.
The visual improvement obtained with the set of proposed methods shows a very clear separation between semantic classes, such as water and vegetation. The first two methods, SI-SE and SI-SECC, do not guarantee results regarding the second objective of this article, but the experimental results show that they enhance the visualization relative to the R-G-B representation.
The proposed methods are not mutually exclusive, as they represent different versions from which users can choose the one that best fits their purpose. These visualization methods could also be an alternative to the quick looks used to choose products on the Copernicus Open Access Hub platform. They would also be useful in active learning applications.

MULTISPECTRAL Earth Observation (EO) images are records of sunlight reflected by the Earth's surface, made using optical sensors. The result obtained from this process consists of

Manuscript received June 3, 2021; revised September 9, 2021; accepted October 15, 2021. Date of publication October 27, 2021; date of current version November 15, 2021. This work was supported by the Romanian Ministry of Education and Research, CNCS - UEFISCDI, project number PN-III-P4-ID-PCE-2020-2120, within PNCDI III. (Corresponding author: Iulia Coca Neagoe.)

Fig. 1. Visualization comparison between the R-G-B representation (first column) and the latent values representation (second column) obtained with one of the proposed methods. The first line shows an ambiguity reduction scenario and the second line a smoke reduction scenario.

Fig. 3. Comparison between spectral signatures and latent SI-SE signatures. The first half of the image shows the graphical representation of the spectral signatures, emphasizing the different patterns of reflectance between different wavelengths. The second half of the image shows the compressed latent representation signatures obtained using the SI-SE method. Each color line represents a band, the x-axis illustrates the pixel number, and the y-axis the pixel values.

Fig. 5. Comparison between spectral signatures and latent SI-SECC signatures. The first half of the image shows the graphical representation of the spectral signatures, emphasizing the different patterns of reflectance between different wavelengths. The second half of the image shows the compressed latent representation signatures obtained using the SI-SECC method. Each color line represents a band, the x-axis illustrates the pixel number, and the y-axis the pixel values.

Fig. 7. Comparison between spectral signatures and latent SI-PcE signatures. The first half of the image shows the graphical representation of the spectral signatures, emphasizing the different patterns of reflectance between different wavelengths. The second half of the image shows the compressed latent representation signatures obtained using the SI-PcE method. Each color line represents a band, the x-axis illustrates the pixel number, and the y-axis the pixel values.
and the transformation of $Y$ to polar coordinates, $\mathrm{polar}_Y$: $\mathrm{error} = \mathrm{MSE}(\mathrm{polar}_X, \mathrm{polar}_Y)$.
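The error above can be sketched under explicit assumptions, since the exact polar transformation is not reproduced in this fragment. The sketch assumes the standard hyperspherical coordinates of a non-negative vector and compares only the angular components, because the radius carries the overall brightness. The names `to_polar`, `mse`, and `polar_mse` are hypothetical, as are the sample values.

```python
import math


def to_polar(x):
    """Return (radius, angles) of a non-negative vector in hyperspherical coordinates.

    angles[k] = atan2(norm of x[k+1:], x[k]); a vector of length n yields n - 1 angles.
    """
    tails = [0.0] * len(x)  # tails[k] = sum of squares of x[k+1:]
    tail_sq = 0.0
    for k in range(len(x) - 1, -1, -1):
        tails[k] = tail_sq
        tail_sq += x[k] * x[k]
    radius = math.sqrt(tail_sq)
    angles = [math.atan2(math.sqrt(tails[k]), x[k]) for k in range(len(x) - 1)]
    return radius, angles


def mse(a, b):
    """Mean squared error between two equally long sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def polar_mse(x, y):
    """MSE between the angular parts of polar X and polar Y (assumed form)."""
    _, angles_x = to_polar(x)
    _, angles_y = to_polar(y)
    return mse(angles_x, angles_y)


# Hypothetical input X and a reconstruction Y that differs only by a brightness factor.
x = [0.12, 0.10, 0.08, 0.35]
y = [0.4 * v for v in x]
error_scaled = polar_mse(x, y)                         # close to zero
error_changed = polar_mse(x, [0.35, 0.08, 0.10, 0.12])  # different spectral shape
```

With this choice, a reconstruction that differs from the input only by a brightness factor is not penalized, while a change in spectral shape is. Whether the radius is also part of the compared vector is not stated here, so that choice remains an assumption of the sketch.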

Fig. 9. Comparison between spectral signatures and latent PcI-SE signatures. The first half of the image shows the graphical representation of the spectral signatures, emphasizing the different patterns of reflectance between different wavelengths. The second half of the image shows the compressed latent representation obtained using the PcI-SE method. Each color line represents a band, the x-axis illustrates the pixel number, and the y-axis the pixel values.

Fig. 11. Comparison between spectral signatures and latent SI-AE signatures. The first half of the image shows the graphical representation of the spectral signatures, emphasizing the different patterns of reflectance between different wavelengths. The second half of the image shows the compressed latent representation obtained using the SI-AE method. Each color line represents a band, the x-axis illustrates the pixel number, and the y-axis the pixel values.

Fig. 13. SAE detailed architecture: four stacked fully connected AEs, having as input the reflectance values for one pixel.
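The structure named in this caption can be sketched as an untrained forward pass. Everything beyond "four stacked fully connected AEs" and "one pixel as input" is an assumption made for illustration: the layer widths in `LAYER_SIZES`, the sigmoid activation, the random weights, and the three-value latent vector (chosen so that it could be displayed as a color) are hypothetical and are not taken from the figure.

```python
import math
import random

# Hypothetical widths: 13 input bands, four encoders, 3 latent values.
LAYER_SIZES = [13, 10, 8, 5, 3]


def make_layer(n_in, n_out, rng):
    """Create a fully connected layer with small random weights and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense_sigmoid(layer, x):
    """Apply one fully connected layer followed by a sigmoid activation."""
    weights, biases = layer
    return [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(row, x)) + b)))
            for row, b in zip(weights, biases)]


def build_sae(sizes, seed=0):
    """Build the encoder stack and the mirrored decoder stack."""
    rng = random.Random(seed)
    pairs = list(zip(sizes[:-1], sizes[1:]))
    encoders = [make_layer(n_in, n_out, rng) for n_in, n_out in pairs]
    # Decoders mirror the encoders: the last encoder is undone first.
    decoders = [make_layer(n_out, n_in, rng) for n_in, n_out in reversed(pairs)]
    return encoders, decoders


def run_stack(layers, x):
    """Pass a vector through a list of layers in order."""
    for layer in layers:
        x = dense_sigmoid(layer, x)
    return x


encoders, decoders = build_sae(LAYER_SIZES)
# Hypothetical reflectance values for one pixel.
pixel = [0.10, 0.08, 0.07, 0.06, 0.09, 0.20, 0.25, 0.30, 0.31, 0.05, 0.01, 0.15, 0.08]
latent = run_stack(encoders, pixel)            # compressed representation
reconstruction = run_stack(decoders, latent)   # same length as the input pixel
```

In a trained network, the weights would be fitted by minimizing the chosen error function between the input and the reconstruction; the sketch only shows how the data flows through the four encoders to the latent vector and back.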

Fig. 14. Sentinel 2 clear scenes over Bucharest, Romania, and Ravenna, Italy, respectively. Demonstration of the visualization improvement through comparison between the conventionally used R-G-B representation and the proposed methods. Each row shows a subset of a scene and each column a visualization method, as follows: first column: R-G-B representation, second column: SI-SE, third column: SI-SECC, fourth column: SI-PcE, fifth column: PcI-SE, sixth column: SI-AE.

Fig. 15. Sentinel 2 entire smoky scenes over Chico, California, and San Jose, California, respectively. Demonstration of the visualization improvement through comparison between the conventionally used R-G-B representation and the proposed methods. Each row shows a subset of the scene and each column a visualization method, as follows: first column: R-G-B representation, second column: SI-SE, third column: SI-SECC, fourth column: SI-PcE, fifth column: PcI-SE, sixth column: SI-AE.

Fig. 16. Sentinel 2 scenes over Parma, Italy, Venice, Italy, and Kyiv, Ukraine, respectively. Demonstration of the visualization improvement through comparison between the conventionally used R-G-B representation and the proposed methods. The first scene is a foggy image, while the last two are cloudy images. Each row shows a subset of the scene and each column a visualization method, as follows: first column: R-G-B representation, second column: SI-SE, third column: SI-SECC, fourth column: SI-PcE, fifth column: PcI-SE, sixth column: SI-AE.

Fig. 17. Comparison of full Sentinel 2 scenes between the R-G-B representations and the results of the proposed methods. The proposed method result emphasizes the burned area (dark blue) in the first scene, while in the second one it highlights the water body.

TABLE I
SENTINEL 2 PRODUCTS USED FOR EXPERIMENTAL RESULTS