Multitask GANs for Oil Spill Classification and Semantic Segmentation Based on SAR Images

The increasingly frequent marine oil spill disasters cause great harm to the marine ecosystem. As an essential means of remote sensing monitoring, synthetic aperture radar (SAR) images can detect oil spills in time and help reduce marine pollution. Many look-alike oil spill regions are difficult to distinguish in SAR images, and the scarcity of real oil spill data makes it difficult to train deep learning networks effectively. To solve these problems, this article designs a multitask generative adversarial networks (MTGANs) oil spill detection model that distinguishes oil spills from look-alikes and segments oil spill areas in one framework. The discriminator of the first generative adversarial network (GAN) is transformed into a classifier, which can effectively distinguish between real and look-alike oil spills. The generator of the second GAN integrates a fully convolutional symmetric structure and multiple convolution blocks. The convolution blocks extract shallow oil spill information, and the fully convolutional symmetric structure extracts deeper oil spill features. The algorithm needs only a small number of oil spill images as the training set, which addresses the limited size of oil spill datasets. Validation is conducted on three datasets from the Sentinel-1, ERS-1/2, and GF-3 satellites, and the experimental results demonstrate that the proposed MTGANs oil spill detection framework outperforms other models in oil spill classification and semantic segmentation. The classification accuracy for oil spills and look-alikes reaches 97.22%. The average overall accuracy (OA) for semantic segmentation of the oil spill area is 97.47%, and the average precision reaches 86.69%.

the leakage of offshore oil drilling platforms. The threat of marine oil spills to the marine ecosystem is more severe than that of other accidents [1]. Once an oil spill occurs in the ocean, it not only damages the marine environment but also harms marine life. Monitoring with satellite remote sensing makes it possible to detect marine oil spills in time, respond quickly after a disaster occurs, and reduce marine pollution. Among the many remote sensing technologies, synthetic aperture radar (SAR) can observe the surface day and night and is sensitive to the wave-damping effect of oil films, which has made it an essential means of oil spill monitoring [2], [3]. The polarization mode of the SAR image affects oil spill detection. The VV polarization from the sea surface exhibits stronger backscattering than the HH polarization, and VV polarized images are much brighter than VH images due to their strong scattering. Thus, oil spills can be better detected using VV polarized SAR images [4]. The process of detecting oil spills based on SAR images is shown in Fig. 1. Marine oil spills appear as dark areas in SAR images due to specular reflection [5]. However, many look-alike areas, such as upwelling, low-wind-speed areas, leeward areas, and biological slicks, also appear as dark regions in SAR images [6]. Therefore, a crucial first task is to correctly classify these dark areas as real or look-alike oil spills.
With its continued development, machine learning has been used by more and more researchers to solve complex problems. Li et al. [7] used the support vector machine (SVM) model to detect oil slicks and similar objects with high accuracy. However, when using SVM for oil spill classification, it is necessary to select an appropriate kernel function and its parameter configuration. Tong et al. [8] utilized the random forest to detect marine oil spills on polarimetric SAR data. Bandiera et al. [9] proposed a Bayesian edge detector that can detect dark spots on the ocean surface and can therefore serve as a first stage in identifying and monitoring oil spills. Basic image processing techniques, such as K-means clustering [10], have also been used to detect oil spills. Frate et al. [11] used neural networks for oil spill detection on ERS-SAR data. Li et al. [12] applied four types of machine learning models to oil spill identification and classification and described the reflectance characteristics; the results were consistent with those of previous studies. In classical machine learning methods, feature extraction and selection for oil spills are carried out manually, which depends heavily on expert experience and takes a long time.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
In recent years, deep learning neural networks have been applied to oil spill detection because of their automatic feature extraction and high detection precision [13], [14], [15]. Chen et al. [16] used two deep learning algorithms to identify oil spills and classify them as mineral, emulsion, and natural oil slicks. Yekeen et al. [17] developed a new deep learning oil spill detection model based on the mask region-based convolutional neural network, in which the target could be segmented with high accuracy even in the case of overlapping. Zhang et al. [18] proposed a supervised detection network for petroleum context and boundary (CBD-Net) that fused multiscale features to extract oil spill areas. CBD-Net improved considerably on the performance of the U-Net [19] model. Zhou et al. [20] proposed U-Net++ based on U-Net. Ma et al. [21] proposed an intelligent oil spill detection architecture based on a deep convolutional neural network, which performed better than traditional methods. Laurentiis et al. [22] studied oil slick classification under the framework of deep learning; their classifier could separate mineral oil films from biological oil slicks and clean ocean. Nieto-Hidalgo et al. [23] proposed a ship and oil spill detection system based on airborne side-looking radar images. The method was composed of three pairs of convolutional neural networks, and the results showed that it was effective. Abdul et al. [24] studied the application of deep learning in oil spill detection and classification and modified U-Net for oil spill and look-alike recognition. Xia et al. [25] proposed an SAR image segmentation method based on the principle of nonlocal processing with a multiscale active contour model, and the results demonstrated the effectiveness and feasibility of the method. Raeisi et al. [26] used an efficient cuckoo search algorithm and nonnegative matrix factorization of different Zernike moment features to discriminate between oil spills and look-alikes in SAR images. Aghaei et al. [27], [28] developed the GreyWolfLSM and OSDES-Net methods for oil spill detection. Wang et al. [29] used the AlexNet convolutional neural network to perform local connection, weight sharing, and representation learning on oil spill images to extract oil spill information from SAR images. Wang et al. [30] presented the MDOAU-Net model for SAR image segmentation in aquaculture raft monitoring. Two novel fusion networks were proposed for hyperspectral and SAR image classification [31], [32]. Deep learning methods [33], [34] have achieved good performance in image segmentation, but limitations remain. Most deep learning methods rely on a large amount of training data to ensure accurate detection results, while in reality actual oil spill data are relatively scarce.
To sum up, achieving accurate identification and detection with limited samples remains a problem. Zhang et al. [35] offered a graph information aggregation cross-domain few-shot learning framework for image classification. Methods developed so far include metalearning [36], [37], metric learning [38], [39], and transfer learning [40], [41]. These methods mainly make full use of existing knowledge and experience to guide the learning of new tasks, but they rely on the feature representation ability of the model, which is complex and unintuitive. A straightforward way to solve the limited-sample learning problem is to increase the number of training samples through data augmentation. However, manual design is required in all of these methods [42]. As another data augmentation method, sample generation can solve the abovementioned problems. Sample generation can not only expand the number of samples but also enrich their content. The generated samples can improve the model's generalization ability and reduce the risk of overfitting. Generative adversarial networks (GANs) [43], first proposed in 2014, are an emerging type of generative model. Because the original GAN training is not constrained, Mirza et al. [44] improved the network and proposed conditional GANs to guide the generator in synthesizing fake samples. Radford et al. [45] designed deep convolutional generative adversarial networks, combining convolutional neural networks and GANs to achieve high-quality image generation. Augustus et al. [46] proposed semisupervised learning with generative adversarial networks (SGAN), which yields an efficient classifier. Gulrajani et al. [47] used Wasserstein GAN (WGAN) to solve the training instability problem. Later, an improved Wasserstein GAN with gradient penalty (WGAN-GP) [48] was presented based on the WGAN scheme.
With the gradual development of GANs, their application fields have become more extensive, covering not only data generation but also image segmentation [49], [50] and image classification [51], [52], and they perform well in other fields too. GANs can expand the number of samples and add adversarial samples to improve the generalization ability of the discriminator. Only a small number of training samples are required during training. Therefore, GANs can also be used for oil spill detection. Yu et al. [53] used a detection model with adversarial f-divergence learning for automatic oil spill identification. Li et al. [54] developed a multiscale conditional adversarial network (MCAN) based on GANs. Only four oil spill images are needed for network training, which addresses the difficulty of obtaining large amounts of oil spill data. However, it does not deal well with SAR images in which the oil spill occupies a small proportion.
Note that these GAN models do not consider how to distinguish real oil spills from look-alike oil spills, which is the most important part of oil spill monitoring. Moreover, it is hard for GANs to converge effectively because it is difficult to generate good adversarial samples. We therefore make full use of look-alike oil spills as effective adversarial samples to improve the classification and segmentation accuracy in the adversarial learning process, which combines the advantages of the GAN model with the specific application of oil spill detection. In addition, previous oil spill detection methods address a single task, and it has not yet been possible to achieve classification and segmentation of oil spills simultaneously in one framework. An end-to-end structure is more in line with the actual daily monitoring of oil spills by satellite remote sensing. Therefore, this article proposes a multitask generative adversarial networks (MTGANs) oil spill detection model to distinguish real and look-alike oil spills and segment real oil spill areas in one framework. The main contributions of this study are as follows.
1) Design an MTGANs oil spill detection framework. The two GAN models can accurately distinguish between real and look-alike oil spills, and the oil spill area can be segmented simultaneously for the first time.
2) Modify the discriminator of GANs. The discriminator is trained as an oil spill multiclassifier to discriminate real oil spill images, look-alike oil spill images, and pseudosamples accurately, while the generator is used to enrich similar samples to improve the discriminative ability of the model.
3) Integrate a fully convolutional symmetric structure and multiple convolution blocks in the oil spill semantic segmentation task. Multiscale oil spill features are extracted to improve semantic segmentation accuracy with very little training data.
The rest of this article is organized as follows. Section II introduces related work and provides prior knowledge. Section III describes the proposed MTGANs model for oil spill detection in SAR images in detail. Section IV presents the experimental results. Finally, Section V summarizes the highlights of the article and future research.

A. Generative Adversarial Network
In 2014, Goodfellow et al. [43] first proposed the GAN. The original GAN is composed of two models: a generator and a discriminator. The purpose of the generator is to learn the data distribution of real images and to generate realistic images that deceive the discriminator. The discriminator's goal is to distinguish the images generated by the generator from the real images. The generator and the discriminator constitute a dynamic game process. Ideally, a Nash equilibrium is eventually reached, in which the generator produces fake images realistic enough to fool the discriminator, and the discriminator finds it difficult to determine whether a sample comes from the real data or from the generator. The GAN model framework is shown in Fig. 2. The generator takes random noise z as input and generates pseudosamples G(z); G(z) is then fed into the discriminator together with the real data, and the discriminator outputs true or false. The output of the discriminator is fed back to guide the generator so that it generates more realistic images.
The traditional GAN is a simple two-layer feedforward neural network; it is not combined with a convolutional neural network and cannot effectively achieve high-quality image generation. The generated image samples are also unconstrained, which makes the results difficult to predict. Therefore, it cannot be applied directly to the oil spill segmentation task.
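The game between the generator and the discriminator described above is commonly written as the minimax objective from the original GAN formulation. It is reproduced here as a reference sketch; the notation $p_{\mathrm{data}}$ (real data distribution) and $p_z$ (noise distribution) is assumed, since this section does not introduce it.

```latex
\min_{G}\max_{D} V(D,G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

Here $D(x)$ is read as the probability that $x$ is a real sample, so the discriminator pushes $V$ up while the generator pushes it down.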

B. Semisupervised Generative Adversarial Network
So far, the goal of most GAN variants has been to generate realistic data samples, so the generator is usually the focus. The main purpose of the discriminator is to help the generator improve the quality of the generated images. At the end of training, the discriminator is usually discarded and only the trained generator is used to generate realistic data.
The main concern in SGAN [46] is the discriminator: SGAN combines the discriminator and a multiclass classifier into one. The discriminator here does not just distinguish between true and false but learns to distinguish n + 1 classes, where n is the number of classes in the training dataset and the extra class is for the pseudosamples generated by the generator. The original binary classifier is thus transformed into a multiclassifier. The goal of the discriminator during training is to distinguish between these classes. The purpose of the generator is to improve the discriminator's classification accuracy by providing additional information that helps it learn relevant patterns in the data. At the end of training, the generator is discarded, and the trained discriminator is used as the classifier. The model framework of SGAN is shown in Fig. 3. The generator takes random noise z as input and outputs the generated pseudosamples G(z). The input of the discriminator consists of three parts: the data with class labels, the data without class labels, and the pseudosamples generated by the generator. The output of the discriminator is the classification result.
Therefore, SGAN can be used in the research field of oil spill classification and recognition. A discriminator is trained to achieve the classification of oil spills and look-alike spills.
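To make the n + 1 class idea concrete, the following minimal sketch turns a vector of discriminator scores into a class decision. It assumes n = 2 real classes (oil spill and look-alike) plus one class for generated samples; the class names, the function `sgan_decision`, and the example scores are illustrative assumptions, not part of the original SGAN or of this article's model.

```python
import math

# Assumed class order: n = 2 real classes, followed by the extra "fake" class.
CLASSES = ["oil_spill", "look_alike", "fake"]

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sgan_decision(logits):
    """Return the most likely class and the probability that the input is real.

    The probability of being real is taken as 1 minus the probability
    assigned to the extra (n + 1)-th class.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], 1.0 - probs[-1]

label, p_real = sgan_decision([2.0, 0.5, -1.0])
```

With these scores the first class has the highest probability, so the sample is assigned to the oil spill class, and most of the probability mass lies on the two real classes.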

A. Overview
The model consists of three parts, as shown in Fig. 4. The first part has a generator and a discriminator. The generator's input is random noise, and its output is pseudosamples. The oil spill images and look-alike oil spill images are preprocessed with thermal noise removal, coherent speckle filtering, and terrain correction. The preprocessed images are fed to the discriminator along with the pseudosamples generated by the generator, and the discriminator classifies each input as oil spill, look-alike, or fake sample. The second part connects the first and third parts. It screens the classification results of the first part: for each image classified as an oil spill, the prediction is compared with its label value, and only images whose ground-truth label is also oil spill are passed on as input to the next part. The third part is composed of a generator and a discriminator. The generator takes the screened images from the second part as input and outputs the generated oil spill segmentation map. The real oil spill image, the generated segmentation map, and the segmentation label map are fed into the discriminator, which distinguishes the generated segmentation map from the real one. The generator is then optimized according to the discriminator's output so that it generates a more realistic oil spill segmentation map and achieves semantic segmentation of oil spill images.

B. Oil Spill Classification Model
The oil spill classification model is built according to the idea of an SGAN. The classification model, shown in the first stage of Fig. 4, is used to classify oil spills and look-alikes. It consists of a generator G_c and a discriminator D_c; the discriminator is modified into a classifier for the oil spill classification task. The generator starts with a BatchNorm layer. The BatchNorm layer counteracts changes in the data distribution during training, which prevents gradients from vanishing or exploding and speeds up training. It is followed by an upsampling layer, a convolutional layer, a BatchNorm layer, and a LeakyReLU activation function. The upsampling layer increases the height and width of the image before it enters the convolutional layer. To prevent vanishing gradients and overfitting during feature extraction, a normalization layer is added after the convolutional layer. The LeakyReLU activation function is added after the BatchNorm layer, which helps the network converge quickly. The data then pass through another upsampling layer, convolutional layer, BatchNorm layer, and LeakyReLU activation function in sequence. Finally, there are a convolutional layer and a Tanh activation function. The discriminator consists of four convolutional blocks, each of which has a convolutional layer, a BatchNorm layer, and a LeakyReLU activation function connected in turn. The output layer has a sigmoid activation function and a softmax activation function. The sigmoid function is used to distinguish between true and false: its output indicates whether a sample is real or fake. The softmax function is used for multiclass classification: its output discriminates between look-alike oil spill images, oil spill images, and fake samples.

Algorithm 1: Classification model.
Input: x: real and look-alike oil spill images; y: classification label; z: random noise; CE: cross-entropy loss; BCE: binary cross-entropy loss
Output: Classification result
for epochs do
Update G_c
end for
Both the generator and the discriminator of the oil spill classification model require the oil spill category information corresponding to the training data during training. Therefore, the oil spill classification model is a supervised model. The input to the generator is random noise z, and its output is the generated pseudosample. The discriminator takes as input the pseudosamples generated by the generator together with the oil spill and look-alike oil spill images labeled with their real categories. The cross-entropy (CE) loss function and the binary cross-entropy (BCE) loss function are used to calculate the loss of the discrimination results in the classification model. CE is applied after the softmax activation function. In the CE loss, f(x_i) is the predicted value output by the network for the sample x_i, and y is the true category information of the corresponding sample. The CE function is used to calculate the loss of the multiclass results of the discriminator, which improves the model's accuracy in classifying oil spills, look-alikes, and pseudosamples. In the BCE loss, b is the true value of sample x, which is true or false, and b̂ is the value predicted by the network. BCE is used to distinguish between real and fake samples, and this loss function guides the generator to generate images closer to the real samples. The discriminator can thus distinguish between real and fake and classify real and look-alike oil spills. In the objective function of the oil spill classification model, z represents random noise, x represents real and look-alike oil spill images, y is the oil spill category information, and G_c(z) is the generated pseudosample. The classification model training strategy is formally described in Algorithm 1. The discriminator is trained to maximize its discrimination ability; as the discriminator improves, true and false samples are distinguished more accurately. The generator is trained to minimize the objective so that it generates images as close as possible to real oil spill and look-alike images and finally fools the discriminator.
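For readers who want to see the two losses at work, the sketch below evaluates the standard per-sample forms of cross-entropy and binary cross-entropy. These are the textbook definitions, assumed here to match the CE and BCE losses the article refers to; the function names and the example probabilities are illustrative only.

```python
import math

def cross_entropy(probs, true_index):
    """Standard CE for one sample with a one-hot label.

    probs is the softmax output over the classes; the loss is the negative
    log-probability assigned to the true class.
    """
    return -math.log(probs[true_index])

def binary_cross_entropy(b_hat, b):
    """Standard BCE for one sample.

    b is the true value (1 for real, 0 for fake) and b_hat is the sigmoid
    output, which must lie strictly between 0 and 1.
    """
    return -(b * math.log(b_hat) + (1 - b) * math.log(1 - b_hat))

# Softmax output for (oil spill, look-alike, fake), true class = oil spill.
ce = cross_entropy([0.7, 0.2, 0.1], true_index=0)

# Sigmoid output 0.9 for a sample that really is real.
bce = binary_cross_entropy(0.9, 1)
```

Both losses shrink toward zero as the predicted probability of the correct answer approaches 1, and the BCE grows sharply if a confident prediction turns out to be wrong.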
After training, the generator is discarded and the discriminator is retained to classify oil spill images and look-alike oil spill images. In addition, the convolutional parts of D_c (the classification part) and D_s (the semantic segmentation part) have the same structure. The trained D_c weights are used as the initial weights of D_s. In this way, the effect of similar adversarial images is transferred to the semantic segmentation part to improve the convergence speed and segmentation accuracy of the algorithm. The identically structured parts of the two GAN models therefore share weight information in the multitask framework.

C. Oil Spill Image Judgment
The judgment model is shown in the second stage of Fig. 4. Because the images identified as oil spills in the previous step may include misclassified look-alike images, a screening operation is performed to eliminate the look-alike images misclassified as oil spills, so that only oil spill images enter semantic segmentation. The label value is used for this judgment. The label values are the oil spill and look-alike category labels from the first stage. If an image classified as an oil spill also has a true label of oil spill, it is retained; otherwise, it is rejected. After filtering, the remaining data are all oil spill images, which are used as input for the next step.
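The screening rule can be expressed in a few lines. The sketch below is a minimal illustration that assumes each sample is available as a tuple of image identifier, predicted class, and true class; the identifiers, class names, and the function `screen_for_segmentation` are hypothetical.

```python
OIL_SPILL = "oil_spill"
LOOK_ALIKE = "look_alike"

def screen_for_segmentation(samples):
    """Keep only images that are both predicted and labelled as oil spills.

    samples is an iterable of (image_id, predicted_class, true_class).
    """
    return [
        image_id
        for image_id, predicted, truth in samples
        if predicted == OIL_SPILL and truth == OIL_SPILL
    ]

samples = [
    ("img_01", OIL_SPILL, OIL_SPILL),    # correct oil spill: kept
    ("img_02", OIL_SPILL, LOOK_ALIKE),   # look-alike misclassified: rejected
    ("img_03", LOOK_ALIKE, LOOK_ALIKE),  # not predicted as oil spill: rejected
    ("img_04", LOOK_ALIKE, OIL_SPILL),   # missed oil spill: rejected
]
kept = screen_for_segmentation(samples)
```

Only the first image passes the screen, so the segmentation stage receives oil spill images exclusively.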

D. Semantic Segmentation Model
The segmentation model is shown in the third stage of Fig. 4 and consists of the generator G_s and the discriminator D_s. The generator begins with a fully convolutional symmetric structure of four downsampling and four upsampling stages, forming a U-shaped structure. Each downsampling component consists of a pooling layer and a double-layer convolutional layer, where two convolutional layers and a ReLU activation function form the double-layer convolutional layer. Each upsampling component consists of a deconvolutional layer and a double-layer convolutional layer of the same form. Over the four downsampling stages, the image is downscaled by a total factor of 16. Symmetrically, upsampling is performed four times, and the resulting high-level semantic feature map is restored to the resolution of the original image. After the four upsampling stages, details such as the edges of the oil spill segmentation map are also recovered more finely. The fully convolutional symmetric structure in the generator extracts oil spill feature information from global to local and, with the help of the discriminator, generates a more accurate oil spill segmentation map. The fully convolutional symmetric structure is followed by a convolutional layer and three convolutional blocks, each having a convolutional layer, a BatchNorm layer, and a LeakyReLU activation function. Finally, there is a convolutional layer and a Tanh activation function. The discriminator consists of four convolutional blocks, each containing a convolutional layer, a BatchNorm layer, and a LeakyReLU activation function, followed by a final convolutional layer.

Algorithm 2: Segmentation model.
Input: I: oil spill images; S: segmentation label
Output: Generated oil spill segmentation map Ŝ
for epochs do
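The total scaling factor of 16 follows from halving the spatial size at each of the four downsampling stages. The short sketch below works through this arithmetic for the 256 × 256 inputs used in the experiments; the per-stage factor of 2 is an assumption inferred from the stated total, and the function name is hypothetical.

```python
def encoder_sizes(input_size, stages=4, factor=2):
    """Side length of the feature map after each downsampling stage."""
    sizes = [input_size]
    for _ in range(stages):
        sizes.append(sizes[-1] // factor)
    return sizes

sizes = encoder_sizes(256)
total_scale = sizes[0] // sizes[-1]
```

The side length goes 256, 128, 64, 32, 16, so the bottleneck is 16 times smaller than the input, and four symmetric upsampling stages bring it back to 256.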
The segmentation model is trained using the WGAN-GP loss [48] to stabilize training. In the objective function of the generator, λ_1 is the weight balance parameter of G_s, S is the input real segmentation label map, Ŝ is the segmentation map generated by the generator, and I is the real oil spill image. ‖S − Ŝ‖_1 is the l_1 norm, which penalizes the pixelwise distance between the segmentation label map and the generated segmentation map.
The generator is trained to minimize this objective so that it generates a more realistic oil spill segmentation map and finally deceives the discriminator. In the objective function of the discriminator, λ_2 is the weight balance parameter of D_s. The term D_s(I, Ŝ) − D_s(I, S) represents the adversarial loss, which increases the discriminator's discriminative ability. The term λ_2(‖∇_S̃ D_s(I, S̃)‖_2 − 1)^2 represents the gradient penalty loss, which produces stable gradients that neither vanish nor explode.
S̃ represents a random variable sampled uniformly between S and Ŝ. The discriminator is trained to minimize this objective so that it distinguishes the generated oil spill segmentation map from the oil spill segmentation label map.
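The two objectives can be written out in the usual conditional WGAN-GP form. The version below is a reconstruction assembled from the terms described in the text (adversarial term, l_1 term, gradient penalty) and from the standard WGAN-GP formulation; it is an assumption about the article's equations rather than a verbatim copy, and the symbols $\hat{S} = G_s(I)$, $\tilde{S}$, and $\epsilon$ are introduced here for clarity.

```latex
\begin{aligned}
\mathcal{L}_{G_s} &= -\,\mathbb{E}\bigl[D_s(I,\hat{S})\bigr]
  + \lambda_1\,\mathbb{E}\bigl[\lVert S-\hat{S}\rVert_1\bigr],
  \qquad \hat{S}=G_s(I),\\[4pt]
\mathcal{L}_{D_s} &= \mathbb{E}\bigl[D_s(I,\hat{S})\bigr]
  - \mathbb{E}\bigl[D_s(I,S)\bigr]
  + \lambda_2\,\mathbb{E}\Bigl[\bigl(\lVert\nabla_{\tilde{S}}D_s(I,\tilde{S})\rVert_2-1\bigr)^2\Bigr],\\[4pt]
\tilde{S} &= \epsilon S+(1-\epsilon)\hat{S},
  \qquad \epsilon\sim U[0,1].
\end{aligned}
```

Both losses are minimized by their respective networks: the generator lowers its loss by raising the critic score of its output and by staying close to the label map, while the discriminator lowers its loss by scoring label maps above generated maps with gradients of norm close to 1.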
The generator input is the real oil spill image, and its output is the semantic segmentation map of the oil spill image. The input to the discriminator is the real oil spill image, the semantic segmentation map generated by the generator, and the oil spill segmentation label map, and its output is a discriminant score. The discriminator's output is then used to optimize the generator so that it generates a more accurate oil spill segmentation map. When D_s is first trained, the D_c weights trained in the classification model are loaded, and the effect of similar adversarial images is transferred to the semantic segmentation part to improve the convergence speed and segmentation accuracy of the algorithm. The segmentation model training strategy is formally described in Algorithm 2, and its overall objective function combines the generator and discriminator objectives. After training, the segmentation model discards the discriminator and keeps the generator, which is used to generate the oil spill segmentation map and thus segment the oil spill area. Finally, the discriminator of the classification model and the generator of the segmentation model are integrated to form the MTGANs, which performs oil spill classification and semantic segmentation as a complete oil spill detection framework. The overall framework is learned sequentially: the semantic segmentation process begins after the classification and judgment parts are completed, so learning errors in the semantic segmentation part are not backpropagated to the judgment and classification models.
However, the classification and segmentation models share the weights of the discriminator, whose convolutional parts have the same structure: the trained D_c weights are set as the initial weights of D_s. Therefore, the overall framework is not entirely multistage independent learning.
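The weight hand-over from D_c to D_s amounts to copying every parameter whose name and shape match and leaving the rest of D_s at its own initialization. The sketch below illustrates this with plain dictionaries standing in for the two networks' parameter tables; the layer names, shapes, and the function `transfer_shared_weights` are hypothetical. In PyTorch, a comparable effect is typically obtained by filtering one model's state dict and passing it to `load_state_dict(..., strict=False)`.

```python
def transfer_shared_weights(source, target):
    """Copy parameters from source into a copy of target when they match.

    Both arguments map a parameter name to a (shape, values) pair. A
    parameter is copied only if the name exists in target with the same
    shape; everything else in target is left untouched.
    """
    merged = dict(target)
    copied = []
    for name, (shape, values) in source.items():
        if name in target and target[name][0] == shape:
            merged[name] = (shape, list(values))
            copied.append(name)
    return merged, copied

# Hypothetical parameter tables: one shared convolution plus task-specific heads.
d_c = {
    "conv1.weight": ((2, 2), [0.1, 0.2, 0.3, 0.4]),
    "cls_head.weight": ((3, 4), [0.0] * 12),
}
d_s = {
    "conv1.weight": ((2, 2), [0.0, 0.0, 0.0, 0.0]),
    "out_conv.weight": ((1, 4), [0.5, 0.5, 0.5, 0.5]),
}
d_s_init, copied = transfer_shared_weights(d_c, d_s)
```

Only the shared convolution is copied; the classification head of D_c has no counterpart in D_s and is ignored, while the output convolution of D_s keeps its own values.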

A. Dataset Descriptions and Preprocess
To demonstrate the effectiveness of the proposed method, Sentinel-1, GF-3, and ERS-1/2 satellite data are used to conduct comparative experiments.
1) Sentinel-1 data: The two satellites of the Sentinel-1 mission carry SAR sensors and are active microwave remote sensing satellites with a resolution of up to 5 m and a swath width of 400 km. The interferometric wide swath imaging mode is used, the polarization mode is VV, and the angle of incidence is 20°-45°. In the experiment, the observed oil spill data are selected and cropped after preprocessing. The cropped oil spill images are 256 × 256 pixels.
2) GF-3 data: The GF-3 satellite is China's multipolarization SAR satellite. It was launched in 2016 and carries a C-band sensor. The swath width is 10-650 km, and the resolution can reach 1 m. The imaging modes are FSI, FSII, and SS. The angle of incidence of FSI and FSII is 19°-50°, and that of SS is 17°-50°. The polarization mode is VV. The oil spill data observed in the experiment are preprocessed and cropped, and the oil spill images are 256 × 256 pixels.
3) ERS-1/2 data: The ESA satellites ERS-1 and ERS-2 were launched in 1991 and 1995, respectively, and carry C-band sensors. The polarization mode is VV, the angle of incidence is 23°-60°, and the swath width is 80-100 km. The observed oil spill data are selected and cropped after preprocessing. The cropped oil spill images are 256 × 256 pixels.
Preprocessing operations of thermal noise removal, radiometric calibration, coherent speckle filtering, and terrain correction are performed on the acquired SAR images. Thermal noise removal mitigates the thermal noise problem. Radiometric calibration is needed on SAR data owing to the cloud-penetrating nature of the signal; it eliminates the sensor's error and determines the accurate radiation value at the entrance of the sensor. Coherent speckle is a common phenomenon in SAR images, and various filters exist to remove it. Here, the refined Lee filter, the most commonly used speckle filter, is applied with a 7 × 7 window, and it performs well. Terrain correction ensures that the real geographic coordinates correspond to the remote sensing image; the range-Doppler method is used for this correction. The images are input into the model after the above processing.
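As a rough illustration of the idea behind Lee-type speckle filtering, the sketch below implements the basic Lee estimator in its simplified additive-noise form with a 7 × 7 window. It is not the refined Lee filter used in the article, which additionally selects edge-aligned sub-windows; the function name and the `noise_var` value are assumptions chosen for the example.

```python
def lee_filter(image, window=7, noise_var=0.01):
    """Basic Lee filter on a 2-D list of floats (simplified, not the refined variant).

    Each output pixel is local_mean + weight * (pixel - local_mean), where
    weight = local_var / (local_var + noise_var). noise_var must be positive.
    """
    rows, cols = len(image), len(image[0])
    half = window // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the window around (r, c), clipped at the image border.
            patch = [
                image[i][j]
                for i in range(max(0, r - half), min(rows, r + half + 1))
                for j in range(max(0, c - half), min(cols, c + half + 1))
            ]
            mean = sum(patch) / len(patch)
            var = sum((p - mean) ** 2 for p in patch) / len(patch)
            # Weight is near 0 in flat areas (strong smoothing) and
            # approaches 1 where local variance is high (detail preserved).
            weight = var / (var + noise_var)
            out[r][c] = mean + weight * (image[r][c] - mean)
    return out

# A flat 8 x 8 image with one bright outlier in the middle.
noisy = [[0.0] * 8 for _ in range(8)]
noisy[4][4] = 1.0
smoothed = lee_filter(noisy)
```

In flat regions the filter returns the local mean, while an isolated bright pixel is pulled toward its neighbourhood without being removed entirely.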
In the experiment, the ground truths of the oil spill SAR images are collected from relevant news reports, published literature, and our daily oil spill monitoring. In addition, look-alike areas can be identified by querying the synchronous marine environment, such as wind speed, sea surface temperature, and chlorophyll concentration [6]. The semantic segmentation ground truths of the oil spill SAR images are generated with the LabelMe annotation tool.

B. Experimental Setup
All experiments are implemented under Windows 10 with Python 3.8, PyTorch 1.7.1, and CUDA 10.1 and run on a GeForce RTX 2080 Ti GPU. The input data are SAR images of 256 × 256 pixels.
For the classification model, the generator G_c and the discriminator D_c are trained using the Adam optimizer with β_1 = 0.5 and β_2 = 0.999, and the learning rate is 0.0002. The classification model is trained for 300 epochs on 120 Sentinel-1 images, with the ratio of the training set to the test set set to 7:3.
For the segmentation model, the generator G_s and the discriminator D_s are trained using the Adam optimizer with β_1 = 0.5 and β_2 = 0.999, a learning rate of 0.0005, and a minibatch size of 1. The balance parameter of the L1-norm constraint is λ_1 = 10, and the gradient penalty weight for the WGAN-GP loss is λ_2 = 10. The segmentation model is trained for 50 epochs. The training and test sets of the segmentation model include Sentinel-1, GF-3, and ERS-1/2 data. For each of the Sentinel-1 and ERS-1/2 datasets, four oil spill images are used for training and 20 images for testing; for the GF-3 dataset, four oil spill images are used for training and five images for testing.
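The settings above can be collected into a small configuration object, which also makes the 7:3 split of the 120 classification images explicit. This is only an illustrative sketch: the class `TrainConfig`, its field names, and the helper `split_counts` are hypothetical, and the batch size of the classification model is left unset because it is not stated in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrainConfig:
    lr: float                          # Adam learning rate
    beta1: float                       # Adam beta_1
    beta2: float                       # Adam beta_2
    epochs: int
    batch_size: Optional[int] = None   # None: not stated in the text
    lambda_l1: Optional[float] = None  # weight of the L1-norm constraint
    lambda_gp: Optional[float] = None  # WGAN-GP gradient penalty weight


CLASSIFICATION = TrainConfig(lr=2e-4, beta1=0.5, beta2=0.999, epochs=300)
SEGMENTATION = TrainConfig(lr=5e-4, beta1=0.5, beta2=0.999, epochs=50,
                           batch_size=1, lambda_l1=10.0, lambda_gp=10.0)


def split_counts(total, train_parts=7, test_parts=3):
    """Return (train, test) image counts for a train:test ratio."""
    train = total * train_parts // (train_parts + test_parts)
    return train, total - train


# 120 Sentinel-1 images at 7:3 give 84 training and 36 test images.
print(split_counts(120))
```

In a PyTorch script, the two Adam coefficients would typically be passed together as the `betas` tuple of `torch.optim.Adam`.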
To evaluate the performance of MTGANs on SAR images, the oil spill classification and segmentation models are assessed in separate comparative experiments. The classification model is compared with the K-nearest neighbor algorithm (KNN) [55], SVM [7], the random forest classifier (RFC) [8], the decision tree classifier (DTC) [12], and the Bernoulli naive Bayes classifier (NBC) [9]. The segmentation model is compared with MCAN [54], SegNet [33], U-Net++ [20], MODAU-Net [30], and DeepLabv3+ [34]. To ensure the reliability of the experiments, all comparative experiments use the same training set and are evaluated with the same performance indicators.

C. Evaluation Criteria
The accuracy rate is used to evaluate the performance of the classification model. It is the proportion of all samples in which oil spill and look-alike oil spill images are correctly classified, that is, Accuracy = a/m, where a represents the number of correctly classified samples and m represents the total number of samples. OA, Precision, F1-score, Kappa, and MIoU are calculated from the confusion matrix to evaluate the performance of the oil spill segmentation model. OA represents the proportion of correctly predicted pixels in the generated oil spill segmentation map. Recall represents the fraction of the real oil spill area that has been detected. Precision represents the proportion of the oil spill area in the generated oil spill segmentation map that is a real oil spill. F1-score considers both precision and recall. MIoU represents the average intersection-over-union ratio over all classes and is often used as an evaluation index for semantic segmentation. Kappa is used to check image consistency.

D. Experiment 1: Sentinel-1 Dataset

1) Oil Spill Classification:
Experiments are first conducted with the oil spill classification model. The classification results for six images are presented in Fig. 5. Among them, Fig. 5(a), (b), and (c) are judged to be oil spill images, and Fig. 5(d), (e), and (f) are identified as look-alike oil spill images.
To ensure the reliability of the method, comparison experiments are carried out for the oil spill classification model. Sentinel-1 data are used, the ratio of oil spill to look-alike oil spill images in the training set is set to 1:1 and 1:2, respectively, and the results are compared with the KNN [55], DTC [12], SVM [7], RFC [8], and NBC [9] methods. The experimental results are shown in Table I. With the 1:2 ratio, the models did not perform well because the oil spill and look-alike oil spill data are unbalanced. After the dataset was balanced by adjusting the ratio to 1:1, the performance of all six methods improved. The NBC model assumes that the attributes are independent of each other, so its classification is poor when the number of attributes is relatively large or the correlation between attributes is strong. Because the oil spill and look-alike oil spill data are strongly correlated, the NBC model misclassifies a large number of images. The four models KNN, DTC, RFC, and SVM all show good performance. Since RFC is an ensemble algorithm, its accuracy is better than that of most individual algorithms, so it achieves high accuracy in oil spill classification. In the oil spill classification model, the MTGANs proposed in this article consist of a generator and a discriminator. The discriminator learns the feature information of oil spill and look-alike oil spill images and learns to distinguish between them. The pseudosamples generated by the generator enrich the number of samples and improve the discrimination ability of the discriminator. Therefore, the proposed MTGANs achieve the highest accuracy in classifying oil spill and look-alike oil spill images.
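The effect of the class ratio can be explored with a small helper that rebalances a labeled set and scores predictions with the accuracy rate (correctly classified samples divided by all samples). The text does not state how the 1:1 ratio was obtained; random undersampling of the larger class is used below purely as an assumption, and the function names are hypothetical.

```python
import random


def balance_by_undersampling(samples, labels, seed=0):
    """Return (sample, label) pairs with every class cut to the smallest class size."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = min(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        # Draw `target` items without replacement from each class.
        for sample in rng.sample(items, target):
            balanced.append((sample, label))
    rng.shuffle(balanced)
    return balanced


def accuracy(predicted, actual):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(p == t for p, t in zip(predicted, actual))
    return correct / len(actual)
```

With a 1:2 set of oil spill and look-alike images, `balance_by_undersampling` returns an equal number of each, at the cost of discarding part of the look-alike data.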
2) Oil Spill Segmentation: In the experiment with the oil spill segmentation model, both the training set and the test set are Sentinel-1 data; only four oil spill images are used for training, and 20 oil spill images are used for testing. The same training and test sets are used to conduct comparative experiments with the MCAN [54], U-Net++ [20], MODAU-Net [30], DeepLabv3+ [34], and SegNet [33] models, and the results are evaluated using five performance metrics.
The oil spill segmentation results are compared and analyzed, and the segmentation results of three oil spill images under the five models are shown in Fig. 6. Most of the oil spill images selected in this experiment have relatively large oil spill areas and regular shapes. U-Net++ maintains excellent performance in this experiment even with a small training set, because it supports training with a small amount of data. Although SegNet, DeepLabv3+, and MCAN also perform well in this experiment, their segmentation accuracy is lower than that of U-Net++. The lack of data leads to relatively blurred edges of the oil spill region in the SegNet segmentation maps. In contrast, the U-Net++ and MCAN methods perform better in the edge region, but they may misclassify the sea surface as oil spill. The MODAU-Net method yields better edge information than the SegNet method, but there are a large number of misjudgments in its prediction maps. DeepLabv3+ uses an encoder and a decoder. The encoder provides high-level semantic feature information, and the decoder recovers boundary information step by step, which improves the segmentation while focusing on the boundaries. This makes the DeepLabv3+ method superior to the SegNet method in handling boundary information, while the SegNet method severely misclassifies the sea surface as an oil spill area. The MTGANs method proposed in this article integrates a fully convolutional symmetric structure and multiple convolutional blocks in the segmentation model. The multiple convolutional blocks extract shallow oil spill information, and the fully convolutional symmetric structure extracts deep features of the oil spill information. With four downsampling and four upsampling steps, the generator extracts oil spill information from global to local and refines the edge information of the oil spill segmentation map, while the discriminator guides the generator in producing the segmentation map. Therefore, the accuracy of the proposed MTGANs is better than that of the other models for oil spill segmentation. The five performance metrics of the three oil spill test images are compared in Table II, and they are consistent with the visual results. Therefore, the proposed MTGANs have the best oil spill image segmentation performance.
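For readers who wish to reproduce metrics of the kind reported in the tables, the sketch below computes OA, Precision, Recall, F1-score, MIoU, and Kappa from a binary confusion matrix (oil spill versus sea surface). These are the standard textbook definitions and are assumed, not confirmed, to match the formulas used in this article; the function names and the 0/1 mask encoding are hypothetical.

```python
def confusion_counts(pred_mask, true_mask):
    """Count TP, FP, FN, TN for binary masks given as 2-D lists (1 = oil spill)."""
    tp = fp = fn = tn = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def _ratio(numerator, denominator):
    # Guard against empty classes so the metrics stay defined.
    return numerator / denominator if denominator else 0.0


def segmentation_metrics(tp, fp, fn, tn):
    total = tp + fp + fn + tn
    oa = _ratio(tp + tn, total)
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    f1 = _ratio(2 * precision * recall, precision + recall)
    # MIoU: mean of the IoU of the oil class and of the sea class.
    miou = (_ratio(tp, tp + fp + fn) + _ratio(tn, tn + fp + fn)) / 2
    # Kappa: agreement corrected for the agreement expected by chance.
    expected = _ratio((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp), total * total)
    kappa = _ratio(oa - expected, 1 - expected)
    return {"OA": oa, "Precision": precision, "Recall": recall,
            "F1": f1, "MIoU": miou, "Kappa": kappa}
```

Because oil slicks usually cover a small part of a scene, OA alone can look high even for poor maps, which is one reason Kappa and MIoU are reported alongside it.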
The average performance indexes of the five methods on the 20 oil spill test images are evaluated and shown in Table III. It can be seen from the table that the MTGANs proposed in this article also achieve the highest average OA, Precision, F1-score, Kappa, and MIoU. The Kappa coefficient is often used to test the spatial consistency of image classification. The mean Kappa value is 65.74%; since a Kappa value of 61%-80% is considered highly consistent, this agrees with the results of the visual interpretation. Therefore, the proposed MTGANs method can achieve high accuracy even with a small training set.
The MTGANs proposed in this article can be used for both oil spill classification and semantic segmentation of oil spill images. The overall experiment is performed on Sentinel-1 data. Forty images are selected for testing: 20 oil spill images and 20 look-alike oil spill images. The test images are input into the MTGANs framework. First, each image passes through the discriminator D_c of the classification model, which outputs the classification result. The accuracy in discriminating oil spill from look-alike oil spill images is 95%. Then, the discrimination results are screened and filtered. Eighteen images are identified as oil spills and are also labeled as oil spills in the ground truth. The generator G_s of the segmentation model is applied to these 18 oil spill images to generate oil spill maps, and the average performance indexes over the 18 images are calculated. The average performance indicators are shown in Table IV.
Some current oil spill detection methods can only perform oil spill classification, some can only perform oil spill segmentation, and few can achieve both. Therefore, a comparative test of oil spill classification and segmentation is carried out to evaluate the performance.
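The classify-then-segment flow of the overall experiment can be outlined as follows. This is a schematic sketch only: `classify` and `segment` stand in for the trained discriminator D_c and generator G_s, whose interfaces are not given in the text, and the function name `detect_and_segment` is hypothetical.

```python
def detect_and_segment(images, classify, segment):
    """Run the two-stage pipeline on a sequence of images.

    classify(image) -> bool   stand-in for D_c (True means oil spill)
    segment(image)  -> mask   stand-in for G_s
    Returns (index, mask) pairs for the images kept by the classifier.
    """
    results = []
    for index, image in enumerate(images):
        # Stage 1: screen out look-alike images.
        if not classify(image):
            continue
        # Stage 2: segment only the images judged to be oil spills.
        results.append((index, segment(image)))
    return results
```

Look-alike images rejected in the first stage never reach the segmentation model, so a classification error in stage 1 directly limits what stage 2 can recover.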

E. Experiment 2: ERS-1/2 Dataset
An oil spill segmentation experiment is conducted on ERS-1/2 satellite data. Again, four oil spill images are selected as the training set and the rest as the test set, and the experimental results are shown in Fig. 7. Visually, oil spills with thin strip shapes are not segmented very accurately by the compared methods, whereas the segmentation maps produced by the MTGANs method proposed in this article are relatively similar to the ground truth.
Three images are selected from the ERS-1/2 test data for analysis, namely, a long strip-shaped oil spill image, a small-area oil spill image, and an oil spill image with a complex and elongated shape. The experimental results are shown in Fig. 7. Most of the oil spill images selected in this experiment contain small and irregularly shaped oil spill areas. Although U-Net++ supports training with a small amount of data, small-area oil spill images make training difficult. Therefore, the segmentation performance of U-Net++ is low in this experiment, and the same occurs for SegNet and DeepLabv3+. In the three result images, the U-Net++, SegNet, and DeepLabv3+ methods can only segment the general shape of the oil spill, and the segmented oil spill area is not distinct, as shown in Fig. 7(b). The MODAU-Net method has excellent segmentation performance on the ERS-1/2 dataset, but it misses edge information of the oil spills. For the small-area oil spill image, it cannot completely segment the shape of the whole oil spill area. For the complex oil spill image, it likewise cannot completely segment the shape of the entire oil spill area, and the slender part of the oil spill is not segmented at all. MCAN outperforms U-Net++, SegNet, and DeepLabv3+ in segmenting small-area oil spill images because it uses a multiscale strategy with cascaded coarse-to-fine data streams to enhance the representation of the model. However, MCAN can produce incorrectly segmented areas, labeling areas that are not oil spills as oil spill areas. The MTGANs method proposed in this article integrates a fully convolutional symmetric structure and multiple convolutional blocks in the generator, so it can capture oil spill information from shallow to deep layers. Therefore, MTGANs can segment small-area oil spill images and reduce the misclassification of oil spill areas. The experimental results show that the proposed MTGANs outperform the other models in segmenting oil spill images with small areas and irregular shapes. The performance indexes of the three oil spill images are shown in Table V, and the average performance indexes over the 20 oil spill test images are shown in Table VI. It can be seen from the table that the F1-score, Kappa, and MIoU values of the U-Net++, SegNet, and DeepLabv3+ methods are generally low because of the small training set. Four training images cannot guarantee the accuracy of the predicted segmentation, and the test results of the U-Net++ and DeepLabv3+ methods contain failed predictions in which the similarity between the segmentation result map and the segmentation label map is almost 0, which is why the performance indicators of these two methods are low. By comparison, the SegNet method does not produce completely wrong predictions, but its segmentation is not good. Both the MCAN method and the proposed MTGANs method can overcome the lack of data and achieve good segmentation results with a small training set. However, the segmentation maps generated by the MCAN method are not highly similar to the ground truth, and some nonoil spill regions are segmented as oil spill regions. Therefore, the proposed MTGANs method has a better segmentation effect on the oil spill area.

F. Experiment 3: GF-3 Dataset
Because GF-3 data are scarce, nine oil spill images are selected for the oil spill segmentation experiment. Among them, four oil spill images are used as the training set, and five oil spill images are used as the test set.
The test results of three GF-3 oil spill images are shown in Fig. 8. Owing to the scarcity of GF-3 data, the dataset includes images with poor oil spill imaging and more noise interference. The U-Net++ method still has excellent segmentation performance for large-area oil spill images, as shown in Fig. 8(a), which is consistent with the results of Experiment 1. However, its segmentation performance is low for images with poor oil spill imaging and more noise interference, as in Fig. 8(b). The MODAU-Net method has better segmentation performance than U-Net++ for complex oil spill images, but the segmentation consistency is low, and the oil spill cannot be completely segmented. The SegNet method has the worst segmentation results, which again shows that it cannot guarantee the accuracy of oil spill segmentation when training data are scarce. This is also consistent with the results of the first two experiments. DeepLabv3+ uses the decoder to gradually recover the boundary information, which improves the segmentation while paying attention to the boundaries. However, in this experiment, because of the small training set, DeepLabv3+ only retains the boundary information and loses the internal information. MCAN outperforms U-Net++ in segmenting images with more noise interference owing to its multiscale strategy and discriminator, but MCAN still produces incorrectly segmented areas, which is also consistent with the results of the first two experiments. The MTGANs method proposed in this article works well for the segmentation of oil spill images with large or small areas and strong noise. The performance indexes of the three test images are given in Table VII and are again consistent with our visual judgments. Therefore, the proposed MTGANs show excellent performance on oil spill images with a large percentage of oil spill area, complex oil spill shapes, small areas, and high noise interference.
The average performance on the five GF-3 oil spill test images is compared, and the average performance indicators are shown in Table VIII. Overall, the proposed MTGANs method has the highest Precision, F1-score, Kappa, and MIoU. This again shows that the proposed MTGANs method has the best segmentation performance when trained on small datasets, which is consistent with the conclusions of Experiment 1 and Experiment 2.

G. Ablation Study
The discriminator of the classification model is composed of multiple convolutional blocks. To show the effect of the number of convolutional block layers on the recognition accuracy for oil spills and look-alike oil spills, the number of layers is set to 3, 4, and 5, respectively, and the classification results are tabulated in Table IX. It can be observed that a larger number of discriminator layers does not necessarily yield better performance. The experimental comparison shows that the highest accuracy in identifying oil spills and look-alike oil spills is achieved when the discriminator has four layers.
In the segmentation model, the discriminator D_s uses multiple layers of convolutional blocks, and the generator G_s has multiple convolutional blocks connected after its fully convolutional symmetric structure. To show whether the number of convolutional blocks affects the oil spill segmentation, the number of convolutional block layers of the discriminator is set to 3, 4, and 5, and that of the generator is set to 2, 3, 4, and 5, respectively. The segmentation results are tabulated in Table X. It can be observed that a higher number of generator layers does not necessarily lead to better performance, as it results in a longer training time and a tendency to overfit, whereas too few layers do not allow the network to learn the feature information. The discriminator also has an impact on the segmentation accuracy, but not as much as the generator. A comprehensive comparison of the F1, Kappa, and MIoU coefficients shows that the segmentation network is optimal in terms of these three coefficients when the discriminator has four layers and the generator has three layers. This indicates that setting the discriminator to four layers and the generator to three layers is important for obtaining a good segmentation.
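The layer-count ablation amounts to a small grid search over 3 × 4 = 12 configurations. The sketch below enumerates the grid and picks a configuration from externally supplied scores. Ranking by the unweighted mean of F1, Kappa, and MIoU is an assumption made here for illustration; the text only says that the three coefficients are combined for a comprehensive comparison. All names are hypothetical.

```python
from itertools import product

DISCRIMINATOR_LAYERS = (3, 4, 5)
GENERATOR_LAYERS = (2, 3, 4, 5)


def ablation_grid():
    """All (discriminator_layers, generator_layers) pairs to train and evaluate."""
    return list(product(DISCRIMINATOR_LAYERS, GENERATOR_LAYERS))


def best_config(scores):
    """Pick the configuration with the highest mean of F1, Kappa, and MIoU.

    scores maps (discriminator_layers, generator_layers) to a dict with
    the keys "f1", "kappa", and "miou", measured on the test images.
    """
    def combined(config):
        metrics = scores[config]
        return (metrics["f1"] + metrics["kappa"] + metrics["miou"]) / 3

    return max(scores, key=combined)
```

Each entry of `ablation_grid()` corresponds to one full training run, so the cost of the study grows with the product of the two search ranges.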
In addition, an experimental comparison is carried out by adjusting the values of the parameters λ_1 and λ_2 on the Sentinel-1 data. A total of 20 oil spill images are selected, and the segmentation results are analyzed using the evaluation indexes. The parameter values are set to 8, 9, 10, and 11, respectively, and the experimental results are tabulated in Table XI. When λ_1 = 10 and λ_2 = 10, the oil spill segmentation performance is optimal.

H. Running Time
The running time of the model is measured on 20 images; the classification and segmentation models are each run five times, and the results are averaged. The average running time of the classification model is 2.10 s, and that of the segmentation model is 5.91 s. For each image, the time to distinguish an oil spill from a look-alike oil spill is about 0.105 s, and the time to segment the oil spill area is about 0.2955 s. Therefore, in the event of an unexpected oil spill, the model can quickly identify oil spills and look-alike oil spills and accurately determine the extent of the oil spill, so that the oil spill disaster can be dealt with in time and the pollution of the marine environment can be reduced.
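The per-image figures follow directly from dividing the averaged batch time by the 20 test images. A minimal sketch of such a measurement is given below; `run` is a placeholder for a function that processes the whole test batch, and both helper names are hypothetical.

```python
import time


def average_runtime(run, repeats=5):
    """Average wall-clock time in seconds of calling run() `repeats` times."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        durations.append(time.perf_counter() - start)
    return sum(durations) / repeats


def per_image_time(total_seconds, n_images):
    """Average time per image for a batch processed in total_seconds."""
    return total_seconds / n_images


# Reported batch times over 20 images: 2.10 s and 5.91 s.
classification_per_image = per_image_time(2.10, 20)  # about 0.105 s
segmentation_per_image = per_image_time(5.91, 20)    # about 0.2955 s
```

On a GPU, timings of this kind are only meaningful if pending kernels are synchronized before the clock is read; that detail is omitted here.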

I. Discussion
The oil spill classification experiment is carried out on the Sentinel-1 dataset. MTGANs classify successfully through the discriminator, while the generator enriches the number of adversarial samples. A large number of look-alike oil spill SAR images are fully used in model learning, which significantly improves the discriminative ability of the model. Fig. 5 shows that real and look-alike oil spill images can be classified accurately. No prior salient feature selection or query of marine physical environment elements is required. Therefore, MTGANs have a stronger discrimination ability than traditional machine learning methods, such as SVM, KNN, and RFC.
The oil spill segmentation experiments are performed on three different oil spill datasets: Sentinel-1, ERS-1/2, and GF-3. The U-Net++, MODAU-Net, SegNet, DeepLabv3+, and MCAN methods are used for comparative analysis. The segmentation behavior of each algorithm is basically consistent across the three datasets and is not affected by the different satellite imaging effects. MODAU-Net uses offset convolutional blocks to convert spatial information into channel information, which allows it to segment fuzzy oil spill boundaries accurately, as shown in Fig. 7(a) and (b). However, a large number of areas without oil spills are still predicted as oil spills in Figs. 6(b) and 8(b). The U-Net++ method redesigns the skip pathways of U-Net and can integrate features of different levels, so oil spill features at different levels are extracted. However, U-Net++ has difficulty predicting oil spill images with strong noise interference, and the oil spill area cannot be predicted in Fig. 8(b). The SegNet method is hard to train with a small dataset, which makes its predictions unreliable, especially in the edge areas. SegNet produces erratic results in Figs. 7 and 8 and fuzzy boundaries in Fig. 6(a) and (b). The DeepLabv3+ method adopts dilated convolution to shrink the feature map and then uses bilinear interpolation upsampling to restore the original resolution, an operation that cannot recover the lost information. The oil spill information is largely lost in Figs. 7(c) and 8(a). Compared with the other methods, the results generated by the MCAN method are closer to the ground truth. However, the generator in the MCAN model is a simple and lightweight network, so it cannot fully capture the overall feature information of the oil spill. MCAN cannot maintain excellent accuracy, so misjudgments also occur when it extracts oil spill information.
The MTGANs proposed in this article add a fully convolutional symmetric structure to the generator in the segmentation stage, which can extract oil spill information from global to local. It not only maintains the overall characteristics of the oil spill without losing the details but also reduces the misclassification of the oil spill from the overall to the local level under the action of the discriminator. The results of the three experiments are consistent with the theoretical analysis. Compared with the other methods, MTGANs maintain the details of oil spill edge features and accurately predict the oil spill area in Figs. 6 and 7, respectively. MTGANs still have the best accuracy in the presence of considerable noise interference, as shown in Fig. 8. Therefore, the proposed MTGANs can maintain excellent performance when classifying and segmenting oil spills.

V. CONCLUSION
In this article, a novel MTGANs framework is proposed for oil spill classification and segmentation. Different from previous works, the proposed method integrates oil spill classification and segmentation into a single framework. In the classification stage of MTGANs, the discriminator is key, and the generator is used to enrich similar samples to improve the discriminative ability of the model. In the segmentation stage, a fully convolutional symmetric structure is added to the generator to extract oil spill information from global to local. It maintains the overall characteristics of the oil spill without losing the details and, under the guidance of the discriminator, substantially reduces misclassification. Therefore, MTGANs can not only address the difficulty of distinguishing oil spills from look-alike oil spills but also alleviate the limitation imposed by scarce oil spill datasets and reduce the misjudgment rate, achieving accurate segmentation of oil spills. Experimental results on three widely used satellite datasets verify the effectiveness of the proposed method. MTGANs are superior to the current state-of-the-art methods.