Statistical Adaptation Loss Improved SMALL Sample Ship Detection Method Based on an Attention Mechanism and Data Enhancement

Gao, Wei; Liu, Yunqing; Zeng, Yi; Li, Qi; Liu, Quanyang

doi:10.3390/app13042520

¹ Department of Information and Communication Engineering, School of Electronic Information Engineering, East Campus of Changchun University of Science and Technology, 7089 Weixing Road, Changchun 130022, China

² Department of Robotics, School of Electronic Information Engineering, East Campus of Changchun University, 6543 Weixing Road, Changchun 130022, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2023, 13(4), 2520; https://doi.org/10.3390/app13042520

Submission received: 28 December 2022 / Revised: 11 February 2023 / Accepted: 12 February 2023 / Published: 15 February 2023

Abstract:

Synthetic aperture radar (SAR) imagery is a promising data source for ocean activity detection. Ship target detection based on SAR images is widely used in maritime trade and the military. SAR image data are scarce, and the amount of public data is small. For applications of SAR image ship target detection, a model with low data dependence, fast iteration and low training cost is needed. In this paper, a balanced positive and negative data enhancement method is used. Through statistical analysis of the training dataset, similar sea areas in the training set are filled with detection targets of suitable size. Increasing the proportion of positive samples in the data helps to improve the model detection effect. A regional attention preadaptation mechanism based on statistical analysis is implemented to extract information, and it is combined with a scale-adaptive loss to improve the detection accuracy of the model. Using the same data, our model exhibited better performance. When using 30% of the data, our model was stable in terms of accuracy and average precision (AP) and maintained detection results similar to the training results achieved using 100% of the dataset.

Keywords:

SAR imagery; ship detection; regional attention; data enhancement; statistical adaptation loss

1. Introduction

Synthetic aperture radar (SAR) [1] images are a promising data source for ocean activity detection. They are widely used in maritime trade and in vessel activity monitoring. SAR images provide all-weather and all-day imaging information and are not affected by meteorological conditions. Unlike optical remote sensing images and multispectral images, high-resolution radar images can still be obtained under interference such as clouds and dense fog, which makes SAR better suited to marine climates. Currently, using SAR images as data sources is an important method for ship target detection [2].

SAR image data are precious and not readily available. However, in the field of remote sensing detection, compared with other data sources, SAR images have many unique characteristics as data sources, which are described as follows:

(1): With the advancement of observation technology, several high-resolution SAR imaging satellites have been put into use, and the amount of SAR data has continued to increase. However, compared with conventional optical images, SAR images are more difficult to interpret, their features are not intuitive, and various coherent speckles and overlapping phenomena increase the difficulty of target interpretation. It is very difficult to obtain accurate and large-scale SAR data sources, and many experienced researchers spend considerable time doing so. The existing database of public SAR images of ship targets is very valuable [3].

Many scholars have investigated how to use the existing data to train a detector with better performance and stronger adaptability. Reasonable data augmentation before training the model is an important means to improve the convergence speed of the model and improve the detection efficiency. In this paper, we carry out targeted, scientific and practical data enhancement according to the characteristics of SAR ship targets; this is used to increase the proportion of positive sample data and to help design detectors with better performance.

(2): The ship target detection algorithm of SAR images is suitable for real-world applications with limited data because it reduces data dependence.

In the field of deep learning, a common method to improve the model detection effect is to increase the amount of data for training the model. In recent years, to promote research on SAR images in ship target detection, domestic and foreign research institutions have successively released multiple data sources, such as the Sentinel-1 Ship Interpretation Dataset (OpenSARShip) [4,5]; SAR Ship Detection Datasets (SSDD/SSDD+) [6,7,8,9,10,11,12]; High-Resolution SAR Ship Objection Detection Dataset version (AIR-SARShip2.0) [13]; SAR-Ship-Dataset [14]; and High-Resolution SAR Image Dataset (HRSID) [15]. These publicly available data effectively alleviate the difficulty of ordinary researchers in obtaining data. Consequently, with so many public datasets, there are more than 27,400 images available. However, this is a small number compared to the common deep learning datasets that often include hundreds of thousands or tens of millions of images. Therefore, when detecting ships in SAR images, researchers are concerned with finding ways to use less data to obtain better detection results.

The original goal of this paper was not to design a detector with a better detection rate or higher accuracy but to develop a method for using less data to obtain the same or similar detection results. To reduce the data dependence of the model, 30% of the data in the SSDD training dataset were used for comparative experimental research. The model was trained separately with the complete training set and with 30% of the training set. The two models trained under these different conditions were compared under the same tuning conditions, and the results were averaged over multiple experimental tests. Our goal is that the final model, when trained with less data, achieves detection accuracy close to that of the model trained using the full training dataset. Compared with general detection algorithms in the field of machine vision, the data enhancement and model optimization in this design are better suited to the characteristics of ship targets in SAR images. At the same time, the design provides ideas for other special target detectors.

(3): The SAR image detection system requires continuous detection, and a fast, iterative system can reduce the training cost of the detection model.

Surface ship detection is important in both civilian and military fields. It has great application value for managing ships entering and leaving ports, combating smuggling, and monitoring marine fisheries and ship oil spill pollution. These practical applications all require a simple and fast ship detection process. In real-world applications, the acquisition equipment continuously obtains new data that need to be relabeled and screened, requiring the detection system to train the model for a long time and be able to iterate rapidly and continuously to optimize the model. Therefore, the detection system requires continuous detection and continuous iteration to adapt to the updated data. If less data are needed during training, it can take less time to iterate faster and adapt to the updated data.

The main contributions of this paper are as follows:

(1): With the same amount of data, balanced positive and negative data enhancement improves the expressiveness of the data so that the model can mine deeper data associations. When the amount of training data is reduced to 30%, the balanced positive and negative data enhancement can still maintain a detection effect close to the training result achieved with 100% of the dataset.
(2): The accuracy of ship target detection is improved by using scale-adaptive loss.
(3): Using the regional attention preadaptation mechanism based on statistical analysis to extract information, combined with the scale-adaptive loss, the detection accuracy of the model is improved.

This paper is organized as follows. Section 2 introduces the related work in the field of ship target detection. Section 3 describes an improved data augmentation method for ship target detection based on a small amount of data and proposes a scale-adaptive loss and a spatial attention mechanism based on statistical analysis to solve the problem of multiscale ship target detection. In Section 4, the experimental results are presented. They are further discussed, with concluding remarks, in Section 5.

2. Related Work

2.1. Object Detection Method Based on a Deep Convolutional Neural Network

Since the United States launched the first SAR satellite in 1978, scholars from various countries have proposed a range of traditional ship detection methods for SAR images, based on constant false alarm rate (CFAR) detection [16], visual saliency [17] and polarization decomposition [18].

Deep learning improves traditional object detection methods [19]. The current deep learning-based target detection methods are divided into two-stage target detection and one-stage target detection. The fast region-based convolutional neural network (Fast R-CNN [20]) is a typical two-stage object detection method. Compared with traditional object detection methods, the speed and accuracy of two-stage object detection have been greatly improved, but they still cannot meet the real-time requirements.

Algorithms based on convolutional neural networks (CNNs) can automatically extract deep features of images, showing high robustness and efficiency. Researchers have tried to use these characteristics of CNNs to improve CFAR. Kang et al. used the region proposals generated by Faster R-CNN [21] as the protection window of CFAR, combining Faster R-CNN with CFAR to detect small-sized ships [22]. Liu et al. used a CNN detector based on land–ocean segmentation (SLS-CNN) for accurate ship detection [23]. These experiments verify the feasibility of CNNs in SAR ship detection. You Only Look Once (YOLO) [24,25,26] is a typical one-stage object detection method. The one-stage target detection algorithm skips the candidate box generation stage, regards each location as a potential target, and tries to classify each area as either detection target or background. YOLOv5 [27], with a lightweight model size, is basically on par with YOLOv4 [28] in terms of detection accuracy.

In recent years, artificial intelligence technology has emerged in the field of image target detection [29,30]. With the introduction of machine learning methods in the field of radar remote sensing, many new solutions and ideas have been introduced into the field of SAR image ship target detection.

Chang et al. [31] proposed an enhanced graphics processing unit (GPU)-based deep learning method for ship detection in SAR images. Kun and Yan [32] proposed an improved YOLOv4-Tiny detection algorithm that introduced attention mechanism units to strengthen feature extraction and make target features more prominent.

2.2. Attention Mechanism

After designing different network structures and developing training strategies for the model, the performance of ship detection in SAR images was significantly improved. However, there are still some difficulties in the detection of multiscale ships. Therefore, extracting global features and suppressing unnecessary and confusing information are the keys to improving the performance of ship detection in SAR images.

When processing information, the attention mechanism only focuses on the part of the regional information that is beneficial to task realization, which not only describes the focus of the model, but also improves the representation of features. The importance of attention mechanisms has been extensively studied in the literature [33].

Zhang et al. [10] designed a lightweight nonlocal attention module embedded side-by-side into the stem network to suppress background disturbances. Chen et al. [34] designed a new strip pool-based attention mechanism and attached it to the extracted features to enhance the representation.

The above SAR ship target attention mechanism was intended to imitate human vision, focus attention on the sea area where ship targets are concentrated, eliminate land scattering and other scattering interference, and reduce the scale sensitivity of the network.

In view of the characteristics of ship targets in SAR images, the characteristics of the data images were analyzed. Common scenarios in the SSDD dataset are shown in Figure 1. In Figure 1a, the ship targets are near the shore, with high target density and strong land scattering around them. In Figure 1b, the ship target appears in the ocean far from the land, and there is less scattering around the target. As shown in Figure 1, different images show different characteristics, which are discussed further in the following sections. Ship targets appear only in the water, and even after land interference is ruled out, large sea areas remain. If all the waters in the image were traversed and detected, the detection efficiency would be greatly reduced. We propose a spatial attention mechanism that is adapted to the characteristics of SAR images and improves the detection efficiency while preserving the detection effect.

2.3. Data Augmentation

In object detection, some methods are often used to enhance the data. Commonly used methods include image flipping, random scaling and mirroring, deformation, rotation, distortion, amplification, adding noise, and other means to increase the number of samples. Increasing the number of small samples by means of data augmentation helps improve the detection performance.
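As a rough illustration of two of the generic operations listed above, the following sketch mirrors an image and adds noise. It assumes an image stored as a list of pixel rows and boxes given as (x, y, w, h); the function names `hflip` and `add_noise` are hypothetical and are not taken from the paper.

```python
import random


def hflip(image, boxes):
    """Mirror an image (list of pixel rows) left to right.

    Boxes are (x, y, w, h) with (x, y) the top-left corner; their x
    coordinate is recomputed so they still cover the same pixels.
    """
    width = len(image[0])
    flipped = [row[::-1] for row in image]
    new_boxes = [(width - x - w, y, w, h) for (x, y, w, h) in boxes]
    return flipped, new_boxes


def add_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise and clip each pixel to [0, 255]."""
    return [
        [min(255.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
        for row in image
    ]


image = [[0, 1, 2, 3], [4, 5, 6, 7]]
flipped, flipped_boxes = hflip(image, [(0, 0, 1, 2)])
noisy = add_noise(image, sigma=2.0, rng=random.Random(0))
```

Both operations act on the whole image, so they increase the number of training images without changing the ratio of ship pixels to background pixels.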

To improve the practical application effect while analyzing and processing remote sensing images, many researchers have focused on the shortcomings of current deep learning-based object detection algorithms. Based on the characteristics of remote sensing technology, some structural refinements were made to the algorithm. In SAR target detection, Wang et al. [8] used transfer learning and data augmentation due to insufficient labeled images for training. A new training image was generated by artificially extracting subimages and adding noise, filtering, flipping and other image processing methods to the original training image.

To improve the contrast of ship targets to clutter, Ai et al. [35] proposed an improved superresolution generative adversarial network (ISRGAN)-based blur suppression algorithm for SAR ship target contrast enhancement. The above data enhancement was mainly processed for the entire image. The amount of training data increased, but the ratio of positive to negative samples did not change.

Kisantal et al. [36] proposed a data augmentation strategy for small objects. Images containing small objects were oversampled, and small objects were enhanced to encourage the model to pay more attention to small objects. This enhancement was mainly aimed at visible light images, and there were many types of small targets. In the enhancement, only the number of small targets was increased, and the characteristics of the targets were not analyzed. The likelihood of a target appearing was enhanced.

In this study, data enhancement is carried out according to the characteristics of ship targets in SAR images, and the balance of positive and negative samples is optimized by improving the enhancement method.

2.4. Introduction to the Generic SAR SHIP Dataset

Usually, to obtain better detection results through deep learning methods, a large number of training samples are needed. As mentioned above, the existing public datasets of SAR image ship data mainly include OpenSARShip [4,5], SSDD/SSDD+ [6,7,8,9,10,11,12], AIR-SARShip2.0 [13], SAR-Ship-Dataset [14], and HRSID [15]. Each of these datasets had its own characteristics.

Each image in the AIR-SARShip series of datasets covered a large area, including both distant sea scenes and docks, roads, and nearby buildings.

The HRSID dataset consists of 5604 panoramic SAR images of 800 × 800 pixels. Offshore scenes with ships distributed in the sea are the main component of HRSID. Ship detection in inshore scenes is influenced by man-made facilities or buildings, so inshore scenes are treated as interference scenes and kept at a certain proportion in HRSID. Challenging scenes, such as adjacent ships, cluster-distributed small ships in canals, and large ships, are also included. The HRSID dataset is similar to AIR-SARShip: both have large image sizes and wide coverage, which makes them suitable for training detectors intended for large scenes with wide swaths.

OpenSARShip 2.0 contains 11,346 image slices; each image contains only one ship target, and the slice image size ranges from 1 K–30 K. The SAR-Ship-Dataset is similar to OpenSARShip 2.0, and both contain images in dual polarization mode, which makes them suitable for analyzing models trained with different polarization modes. Li et al. [5] reported that OpenSARShip 2.0 has special features, as shown in Table 1. These features have a certain impact on the model training described later.

3. Materials and Methods

3.1. The Dataset Used in this Study

The initial version of the dataset used in this paper—the SSDD dataset—was released in 2017 and is the earliest publicly available dataset. In 2021, SSDD+ was corrected [6,31,32,33,34,35,36]. The improved SSDD+ dataset—after removing missing annotations and false annotations—contained rich sample types.

The number of samples in this dataset was moderate, making it conducive to model debugging and improving work efficiency. Many scholars have used the SSDD+ dataset to conduct experiments; because of the diversity of its data, it is helpful for establishing a reliable prediction model.

In our experiment, only 30% of the data in the SSDD dataset were used for training to reduce the data dependence of the model. This enabled the model to keep the detector results stable in scenarios where the actual data used are insufficient.
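A reproducible 30% subset can be drawn as in the sketch below. The paper does not state how the subset was selected, so uniform random sampling with a fixed seed is an assumption here, and `sample_subset` and the image identifiers are hypothetical.

```python
import random


def sample_subset(image_ids, fraction=0.3, seed=0):
    """Return a reproducible random subset of the training image IDs.

    Assumption: images are drawn uniformly at random without replacement.
    """
    rng = random.Random(seed)
    k = max(1, round(len(image_ids) * fraction))
    return sorted(rng.sample(list(image_ids), k))


all_ids = [f"img_{i:04d}" for i in range(100)]  # placeholder IDs
subset = sample_subset(all_ids)
```

Fixing the seed keeps the subset identical across runs, which matters when results are averaged over repeated experiments.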

3.2. Balanced Positive and Negative Data Enhancement

In the initial experiments with ship targets on SAR images, detection errors often occurred. Specifically, there were two kinds of errors: a target was present but was not detected (missed detection, Figure 2a), and no target was present but one was detected (false detection, Figure 2b). Statistical analysis of these error types found that the latter outweighed the former.

We analyzed the causes of the above two types of error. Most of the area in the images was land or ocean, and the detection target represented a very small part. During detection training, all oceans and land were negative samples for ship target detection. Positive samples, on the other hand, only accounted for a very small proportion of the images. To obtain the ideal detection rate, it is necessary to increase the proportion of positive samples in the training data, that is, to increase the number of ship targets in the images.
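To make the imbalance concrete, the share of an image covered by ship boxes can be computed as in the sketch below. The image size and boxes are made-up illustrative values rather than SSDD statistics, and `positive_fraction` is a hypothetical helper.

```python
def positive_fraction(image_w, image_h, boxes):
    """Fraction of the image area covered by ship boxes (x, y, w, h).

    Assumption: boxes do not overlap, so their areas can simply be summed.
    """
    ship_area = sum(w * h for (_, _, w, h) in boxes)
    return ship_area / (image_w * image_h)


# Two small ships in a 500 x 350 image (illustrative numbers only).
frac = positive_fraction(500, 350, [(40, 60, 20, 10), (300, 200, 30, 12)])
```

With these numbers the ships cover 560 of 175,000 pixels, so well under 1% of the image is positive, which is the kind of imbalance the enhancement method is meant to reduce.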

When performing data enhancement, it is necessary to consider the balance of positive and negative samples and the characteristics of ship targets in SAR images.

Ship targets have distinct characteristics:

Ship targets only appear in the open sea or near shore.
In a large area of sea, ships have a fixed travel route. The common routes include relatively dense ship traffic near the coast, straits, and seaports. Furthermore, in a vast region of the sea, ships have a higher probability of appearing near the travel route.
Unlike natural images, the targets not only account for a small proportion in the picture but also have a small target size. According to the characteristics of the SAR image acquisition sensor, the target size has a certain distribution interval.

When the data enhancement method addresses the above points, it is possible to train a feature detector that is suitable for SAR ship targets and ensures the balance of positive and negative samples.

Based on the characteristics of SAR image data, two kinds of data enhancement were carried out.

Method 1: Enhancement of target data based on statistical analysis.

In this method, the locations where ships appear are counted, and regional enhancements are performed. In some characteristic sea areas, such as locations close to land and between straits, ship targets are more likely to appear. First, the locations where the target ships appear in the SAR image are counted; then, ship targets are added in the sea areas where many ships appear, thus increasing the number of ship targets and the number of positive samples.

Method 2: Data augmentation based on statistical analysis of target size.

When a ship target is added, the enhancement cannot follow the practice used for natural images, and random scale transformation cannot be applied. Targets that are too large or too small do not match the characteristics of ship targets. Therefore, when performing data enhancement, the size of each new target is set by referring to the size of real targets.

The overall data augmentation process is as follows:

In the first step, we compute the distribution of pixel values inside the prelabeled bounding boxes to characterize where a ship is present. Then, we randomly select a small area and use the previously computed statistics to determine whether the values in this area are consistent with ocean. If they are, a labeled ship target is pasted into the area and annotated. The last step is to adjust the size and shape of the ship while filling in the ship target so that the size of the ship is between the maximum size and the minimum size of the existing targets in the image.
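The procedure above can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: the ocean test (region mean within `sea_tol` of a precomputed `sea_mean`), the nearest-neighbour resizing, and all function names are assumptions made for the example.

```python
import random


def region_mean(image, x, y, w, h):
    """Mean pixel value of the w x h region whose top-left corner is (x, y)."""
    vals = [image[r][c] for r in range(y, y + h) for c in range(x, x + w)]
    return sum(vals) / len(vals)


def overlaps(a, b):
    """True if two (x, y, w, h) boxes intersect."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def resize_nearest(patch, new_w, new_h):
    """Nearest-neighbour resize of a patch stored as a list of rows."""
    old_h, old_w = len(patch), len(patch[0])
    return [
        [patch[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


def add_ship(image, boxes, ship_patch, sea_mean, sea_tol, rng, max_tries=100):
    """Paste one ship into an ocean-like region; return its box or None.

    Modifies `image` and `boxes` in place. Requires at least one existing
    box, because the new size is kept within the range of existing sizes.
    """
    img_h, img_w = len(image), len(image[0])
    min_w, max_w = min(b[2] for b in boxes), max(b[2] for b in boxes)
    min_h, max_h = min(b[3] for b in boxes), max(b[3] for b in boxes)
    for _ in range(max_tries):
        w, h = rng.randint(min_w, max_w), rng.randint(min_h, max_h)
        x, y = rng.randint(0, img_w - w), rng.randint(0, img_h - h)
        candidate = (x, y, w, h)
        if any(overlaps(candidate, b) for b in boxes):
            continue  # new targets must not overlap existing ones
        if abs(region_mean(image, x, y, w, h) - sea_mean) > sea_tol:
            continue  # region does not look like open water
        resized = resize_nearest(ship_patch, w, h)
        for r in range(h):
            image[y + r][x:x + w] = resized[r]
        boxes.append(candidate)
        return candidate
    return None
```

A real pipeline would also blend the pasted patch with the surrounding speckle and bias the sampled location toward areas where ships are statistically frequent; both are omitted here for brevity.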

Figure 3 shows the results of data enhancement with positive and negative sample balancing.

It can be seen from the figure that data enhancement is performed on ship targets in different scenarios and different sea conditions. Nonoverlapping ship targets of suitable size are added near existing ships. As originally intended, the brightness and shape characteristics of the newly added targets conform to those of the existing targets. The number of positive samples in the enhanced data is increased, which helps to improve the accuracy of the detection network. In the subsequent experiments, 30% of the enhanced data are used each time as the next model training data. The detection effect of the trained model is tested through multiple cross-validation experiments.

3.3. Scale Adaptation Loss

The purpose of convolution is to extract features, but in the process of extracting features, information is lost. The ideal situation is to extract critical information and ignore noncritical information. This ideal is difficult to achieve, and the loss of information is unavoidable. To minimize the loss of key information, it is necessary to effectively discriminate between critical information and noncritical information. For the detection task in this paper, there are obvious differences in the corresponding object scales, but in the calculation process, the same weight optimization was carried out on the features of different scales, increasing the computational burden of the system and leading to a lack of precision. Different numbers of convolutions will result in differences in the fineness of features. The more convolution processing is performed, the more obvious the feature retention is, and the effect is optimized for small targets. Similarly, large targets need to be close to the tail-end features.

In practical applications, the acquisition system continuously acquires new image data, and the system needs to iterate quickly to adapt to the new data.

A new loss was designed for the ship target characteristics of SAR images. The specific process is described as follows:

The first step is to calculate statistics based on 30% of the data and count the size and number of ship targets in the data source.

In the second step, the collected targets are divided into three categories according to their size. The smallest 30% are regarded as small targets, the largest 30% are regarded as large targets, and the remainder are regarded as medium targets.
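The three-way split in the second step can be sketched as below. Using box area as the size measure is an assumption (the paper does not say which measure was used), and `split_by_size` is a hypothetical helper.

```python
def split_by_size(areas, small_frac=0.3, large_frac=0.3):
    """Label each target 'small', 'medium' or 'large' by its size rank.

    The smallest `small_frac` of targets are small, the largest
    `large_frac` are large, and the rest are medium. Targets whose size
    ties with a boundary value receive the boundary's label.
    """
    ordered = sorted(areas)
    n = len(ordered)
    n_small = round(n * small_frac)
    n_large = round(n * large_frac)
    small_max = ordered[n_small - 1] if n_small else float("-inf")
    large_min = ordered[n - n_large] if n_large else float("inf")

    def label(area):
        if area <= small_max:
            return "small"
        if area >= large_min:
            return "large"
        return "medium"

    return [label(a) for a in areas]


labels = split_by_size([100, 10, 50, 30, 80, 20, 70, 40, 90, 60])
```

Because the thresholds are computed from the training subset itself, they adapt automatically when the subset or the dataset changes.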

The third step is to set the attenuation factor according to the overall distribution of the targets in each image. For example, if there are many small targets in a certain picture, the underlying structure in the convolutional layer should play a greater role, thereby reducing the information loss of small targets. The attenuation factor is adjusted according to the size of the target. The specific adjustment method is as follows: for the feature outputs of the three scales in the YOLO structure, when there is no target, in the process of calculating the loss, the terms corresponding to targets of different sizes are multiplied by α and β, and the 1-α-β decay factor constrains the features.

The main goal of loss is balance. Due to the addition of loss design, the target of the corresponding area will produce a large loss, so that it accounts for a relatively high proportion. We focus on optimizing it to make it more balanced. Through adaptive design, each part of the model takes its due responsibility in the face of different scales.
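One possible reading of the α, β, and 1-α-β weighting is sketched below. The paper does not give the formula for the factors, so deriving them from the smoothed share of small, medium, and large targets in an image is an assumption, as are the function names and the `smoothing` parameter.

```python
def scale_weights(labels, smoothing=1.0):
    """Return (alpha, beta, 1 - alpha - beta) for one image.

    `labels` holds one of 'small', 'medium', 'large' per target. The
    weights follow the smoothed share of each size class, so no output
    scale is ever switched off completely (assumed design, not the
    paper's formula).
    """
    total = len(labels) + 3 * smoothing
    alpha = (labels.count("small") + smoothing) / total
    beta = (labels.count("medium") + smoothing) / total
    return alpha, beta, 1.0 - alpha - beta


def weighted_loss(loss_small, loss_medium, loss_large, labels):
    """Combine the losses of the three YOLO output scales."""
    alpha, beta, gamma = scale_weights(labels)
    return alpha * loss_small + beta * loss_medium + gamma * loss_large


labels = ["small"] * 5 + ["medium"] + ["large"]
alpha, beta, gamma = scale_weights(labels)
total_loss = weighted_loss(1.0, 2.0, 3.0, labels)
```

In this example the image is dominated by small targets, so the small-target output scale receives the largest weight, which matches the intent described in the third step.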

3.4. Spatial Attention Mechanism Based on Statistical Analysis

The SAR ship target image is usually divided into two parts: the nearshore and the far sea. The most important feature of ship targets is that ship targets only appear in the ocean. There can be many objects on land exhibiting the same strong scattering effect as the target, which affects the accuracy of the ship detector. It is necessary to eliminate land interference to effectively detect ship targets. As mentioned above, the spatial attention mechanism is usually used to focus on the target area and eliminate the scattering disturbance near the ground and coast. However, the detection speed of the attention mechanism of traversing samples is limited. The contribution of this paper is the proposal of a spatial attention mechanism based on statistical analysis, which can improve the detection efficiency while ensuring the detection effect.

The example SAR ship image shows that the image as a whole is dominated by background information: the amount of background information is much greater than that of the foreground information (ship targets). When the neural network processes information, it treats background information and foreground information equally. However, there are no targets in the land background, and the probability of large-scale targets in the open sea is also small. Targets generally appear on fixed routes and in offshore areas. We designed an attention mechanism that is primarily used to learn the probability of a target existing in the corresponding regional background and is applied in the front part of the large-scale neural network. At this stage, the attention mechanism based on statistical analysis is used to remove part of the background information, reduce the computing resources occupied by the background information, and reduce the operations on the corresponding areas during gradient descent.

At the front end of the neural network, through feature extraction, the network was trained to learn the importance of segmented regions. The probability of ship targets appearing in each region is calculated as follows.

Step 1: Divide the whole into small areas equal in size.

Step 2: Use convolution to build a neural network and count the probability of the occurrence of the target in this area. The value A_t is between 0 and 1. According to statistical analysis, some areas in a picture are defined as “unimportant” areas (for example, land, where the probability of target occurrence is A_t = 0.1), and the areas where targets appear are “important” areas (for example, ocean, where the probability of target occurrence is A_t = 0.7). Some area divisions include both ocean and land conditions; this usually occurs in a near-shore area, also marked as an “important” area. Based on statistical analysis, we found that “unimportant” regions are usually connected to “unimportant” regions.

Step 3: Map the calculation result of the statistical analysis of the grid back to the previous network and reflect the importance of the corresponding grid area in the original image.

Through such front-end processing, the calculation amount of the subsequent neural network is reduced, and the target detection neural network can focus the calculation on the area corresponding to the larger attention by adjusting the parameters. In this way, the detection effect is ensured, the detection efficiency is improved, and the computational burden of the neural network is reduced.
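The grid statistics behind Steps 1 to 3 can be approximated with simple frequency counting, as in the sketch below. This replaces the learned convolutional estimator described in the text with a plain count of ship centers per cell, assumes all images share one size, and uses hypothetical names and an arbitrary threshold.

```python
def cell_probabilities(annotations, image_w, image_h, k):
    """Estimate, for each cell of a k x k grid, how often a ship appears.

    `annotations` is a list with one entry per image, each a list of
    (x, y, w, h) boxes. A cell's value is the fraction of images that
    have at least one ship center inside it, so it lies in [0, 1].
    """
    counts = [[0] * k for _ in range(k)]
    for boxes in annotations:
        hit = set()
        for (x, y, w, h) in boxes:
            cx, cy = x + w / 2, y + h / 2
            col = min(k - 1, int(cx * k / image_w))
            row = min(k - 1, int(cy * k / image_h))
            hit.add((row, col))
        for row, col in hit:
            counts[row][col] += 1
    n = max(1, len(annotations))
    return [[c / n for c in row] for row in counts]


def important_cells(probs, threshold=0.5):
    """Mark cells whose estimated probability reaches the threshold."""
    return [[p >= threshold for p in row] for row in probs]


annotations = [[(10, 10, 10, 10)], [(12, 8, 6, 6), (80, 80, 10, 10)], []]
probs = cell_probabilities(annotations, 100, 100, k=2)
mask = important_cells(probs)
```

The resulting mask plays the role of the importance map that is mapped back onto the original image, so that later stages can spend less computation on cells where ships almost never occur.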

Using the above importance learning steps, the importance relationship between different regions can be found spontaneously through the weight matrix. The attention mechanism based on statistical analysis used in this paper initializes n groups of Q, K, and V matrices, where the value of n refers to the number of additional channels that form features. The expressions of the corresponding matrices are shown in Equations (1) and (2).

Q_{i} = W^{Q} * x_{i}, \quad K_{j} = W^{K} * x_{j}, \quad i \neq j; \; i = 1, 2, \dots, n \; (n = k \times k)    (1)

ω_{i} = \mathrm{sigmoid}\left(\frac{Q_{i} * K_{j}}{\sqrt{d}}\right)    (2)

where W^{Q} and W^{K} are two different weight matrices, x_{i} and x_{j} are the corresponding input vectors, and ω_{i} is the weight of each target detection area after the sigmoid function is activated. The closer the value is to 1, the more important the block area is.

Therefore, the above method was used for the current target detection area divided by each block to obtain its importance relative to other areas, and finally, the weighting process was performed to obtain the overall weight. Figure 4 shows the structure of the spatial attention mechanism based on the statistical analysis in this paper.
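Equations (1) and (2) can be illustrated with the small sketch below. The equations define a score for a pair of regions i and j, and the text only says the scores are then weighted; averaging the scores over all j ≠ i to obtain a single weight per region is an assumption, as are the function names and the use of the query dimension as d.

```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def region_weights(features, W_q, W_k):
    """Importance weight of each region relative to all other regions.

    For region i, sigmoid(Q_i . K_j / sqrt(d)) is evaluated for every
    j != i and the results are averaged (the averaging is an assumption).
    Needs at least two regions.
    """
    d = len(W_q)  # dimension of the query and key vectors
    queries = [matvec(W_q, x) for x in features]
    keys = [matvec(W_k, x) for x in features]
    weights = []
    for i, q in enumerate(queries):
        scores = [
            sigmoid(sum(a * b for a, b in zip(q, k)) / math.sqrt(d))
            for j, k in enumerate(keys)
            if j != i
        ]
        weights.append(sum(scores) / len(scores))
    return weights


features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # one vector per region
identity = [[1.0, 0.0], [0.0, 1.0]]
weights = region_weights(features, identity, identity)
```

Because the sigmoid is applied to each pair separately, the weights are not forced to sum to 1 as they would be with a softmax, so several regions can be marked as important at the same time.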

4. Results

4.1. Ablation Experiment

Table 2 shows that the model using a layer of adaptive mechanism and data enhancement as the ship detection network had the best results, and the model precision, recall rate and average precision (AP) were 91.41%, 92.64%, and 93.85%, respectively. In general, using data augmentation and scale-adaptive loss structure on the adaptive mechanism structure worked better than only using the adaptive mechanism and only using a single data augmentation or scale-adaptive loss structure. Therefore, based on the analysis, we concluded that the performance of the ship target detection model using the adaptive mechanism and data augmentation in this paper was better than that of the traditional model and the model using only a single adaptive mechanism.

4.2. Comparative Experiment

Table 3 shows the comparison of the application of several models to the SSDD dataset. When applied to the 30% SSDD dataset, our model performed well in terms of both accuracy and AP: 13.35% higher than the SSD model, 12.1% higher than the YOLO v5, and 20.27% higher than the FPN model. In addition, our model performed the best in terms of the average detection rate AP, which was 92.92%. Our model still performed well when using 30% of the other datasets. On the 30% OpenSARShip dataset, the model accuracy rate was 91.72%, and the recall rate was 90.46%, which was slightly lower than that of the SSDD dataset. On the 30% HRSID dataset, the model performance was slightly worse, but it was more than 10% higher than that of the other models. This proved that our model was more suitable for ship target detection in SAR images.

Table 3 shows that, when trained on 30% of the data, the ship target detection model in this paper had the highest accuracy on the SSDD dataset, and its recall and AP were also higher than those of the other models. On the OpenSARShip dataset, our model had the highest recall and AP. On the HRSID dataset, the model performed less well. We also applied the single shot multibox detector (SSD) model, the YOLO v5 model, and the feature pyramid network (FPN) model to the SSDD dataset, and their detection performance was lower than that of the ship detection model in this paper. Therefore, the model in this paper exhibited better performance and was better suited to target detection than the other models.

When trained on 100% of the SSDD dataset, our model was comparable to the other models in terms of both accuracy and AP, with an AP of 95.88%. Our model also performed well on the other two datasets, reaching slightly higher levels than the other models.

5. Discussion

The results show that, given the same data, our model performed better. The advantage was especially clear when only 30% of the data was used. We demonstrated that our model remained stable even with a small amount of data, which facilitates its practical generalization.

Table 3 shows that, compared with its results on the SSDD dataset, our model generally performed slightly worse on the OpenSARShip dataset. This is attributable to the characteristics of the dataset, as shown in Table 1.

The performance on the HRSID dataset was also slightly inferior to that on the SSDD dataset, mainly because the HRSID images contain multiple targets each. These targets were not only numerous but also small in scale. The image backgrounds were also the most complicated: a considerable part of each image was coast and land, and the images even contained many scenes of inland rivers and estuaries. In some images, the land area far exceeded the water area, increasing the difficulty of target detection.

In the above comparative experiments, our attention was mainly focused on the detection performance of the model when using 30% of the data. The experimental results showed that the data enhancement method for SAR ship target data proposed in this paper, combined with the regional attention pre-adaptive information extraction mechanism based on statistical analysis and the scale-adaptive loss, can effectively reduce data dependence. The model remained stable when only a small amount of data was used. The results show that, with 30% of the data, its detection performance was much higher than that of the other models. The SAR ship detection model designed in this paper showed the same trend in detection performance on different types of public SAR ship datasets. The comparison confirmed that our model performs well.
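The reduced data dependence can be quantified from Table 3 by comparing, for each model and dataset, the mAP obtained with 100% and with 30% of the data. The sketch below does only this arithmetic; the names are illustrative and the values are copied from Table 3.

```python
# (mAP with 100% of the data, mAP with 30% of the data), from Table 3.
MAP_FULL_VS_SMALL = {
    "SSD + SSDD": (96.51, 79.57),
    "YOLO v5 + SSDD": (96.60, 80.82),
    "FPN + SSDD": (95.93, 72.65),
    "Our model + SSDD": (95.88, 92.92),
    "Our model + OpenSARShip": (94.95, 92.13),
    "Our model + HRSID": (94.72, 91.96),
}


def map_drop(scores):
    """Loss in mAP (percentage points) when going from 100% to 30% data."""
    return {
        name: round(full - small, 2)
        for name, (full, small) in scores.items()
    }


drops = map_drop(MAP_FULL_VS_SMALL)
for name, drop in sorted(drops.items(), key=lambda item: item[1]):
    print(f"{name}: -{drop:.2f} points")
```

On these figures our model loses less than 3 points of mAP on each of the three datasets, whereas the SSD, YOLO v5 and FPN baselines lose between roughly 16 and 23 points on SSDD.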

6. Conclusions

In this paper, a data enhancement method based on balancing positive and negative samples is proposed to improve the detection performance of the model. According to the characteristics of SAR images, the scale-adaptive loss can improve the accuracy of ship target detection. In addition, the regional attention preadaptation mechanism based on statistical analysis is used to extract information, and it is combined with the scale-adaptive loss to improve the detection accuracy of the model. The accuracy of our model for ship target detection in SAR images was verified through experiments. When using 100% of the SSDD dataset, our model performed slightly worse than the other models, but the difference was small. When using 30% of the data, our model performed well in terms of both accuracy and AP. We demonstrated that our model remained stable even with a small amount of data, and that it performed better than the other models when given the same data. Our design goal was achieved: when the amount of training data dropped to 30%, the results were close to those obtained with 100% of the dataset. In summary, our model effectively reduces data dependence and improves monitoring efficiency and accuracy simultaneously. We expect our model to have high practical value in marine ship monitoring and to provide some ideas for the design of detection systems for other special image data. For ship target detection in SAR images, it is necessary to further explore how to improve detection accuracy when the land area is larger than the water area. In view of the actual use environment, a detection model with greater adaptability, higher fault tolerance and greater robustness warrants further exploration.

Author Contributions

Conceptualization, W.G.; writing—review and editing, W.G.; software, Y.Z.; visualization, Q.L. (Quanyang Liu) and Q.L. (Qi Li); project administration, Y.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Science and Technology Department Project of Jilin Province (under grants No. 20200404210YY and No. 20210502021ZP).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Written informed consent has been obtained from the patients to publish this paper.

Data Availability Statement

The SSDD product used in this work is available at: https://github.com/TianwenZhang0825/Official-SSDD (accessed on 1 February 2022). The HRSID product used in this work is available at: https://github.com/chaozhong2010/HRSID (accessed on 1 February 2022). The OpenSARShip product used in this work is available at: http://opensar.sjtu.edu.cn/; https://opensar.sjtu.edu.cn/DataAndCodes.html (accessed on 1 February 2022).

Conflicts of Interest

The authors declare no conflict of interest.

References

Moreira, A.; Prats-Iraola, P.; Younis, M.; Krieger, G.; Hajnsek, I.; Papathanassiou, K.P. A tutorial on synthetic aperture radar. IEEE Geosci. Remote Sens. Mag. 2013, 1, 6–43.
Shao, Z.; Wu, W.; Wang, Z.; Du, W.; Li, C. SeaShips: A large-scale precisely annotated dataset for ship detection. IEEE Trans. Multimed. 2018, 20, 2593–2604.
Born, G.H.; Dunne, J.A.; Lame, D.B. Seasat mission overview. Science 1979, 204, 1405–1406.
Huang, L.; Liu, B.; Li, B.; Guo, W.; Yu, W.; Zhang, Z.; Yu, W. OpenSARShip: A dataset dedicated to Sentinel-1 ship interpretation. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2017, 11, 195–208.
Li, B.; Liu, B.; Huang, L.; Guo, W.; Zhang, Z.; Yu, W. OpenSARShip 2.0: A large-volume dataset for deeper interpretation of ship targets in Sentinel-1 imagery. In Proceedings of the 2017 SAR in Big Data Era: Models, Methods and Applications (BIGSARDATA), Beijing, China, 13–14 November 2017.
Zhang, T.; Zhang, X.; Li, J.; Xu, X.; Wang, B.; Zhan, X.; Xu, Y.; Ke, X.; Zeng, T.; Su, H.; et al. SAR ship detection dataset (SSDD): Official release and comprehensive data analysis. Remote Sens. 2021, 13, 3690.
Zhang, T.; Zhang, X.; Ke, X.; Zhan, X.; Shi, J.; Wei, S.; Pan, D.; Li, J.; Su, H.; Zhou, Y.; et al. LS-SSDD-v1.0: A Deep Learning Dataset Dedicated to Small Ship Detection from Large-Scale Sentinel-1 SAR Images. Remote Sens. 2020, 12, 2997.
Zhang, T.; Zhang, X.; Shi, J.; Wei, S. HyperLi-Net: A hyper-light deep learning network for high-accurate and high-speed ship detection from synthetic aperture radar imagery. ISPRS J. Photogramm. Remote Sens. 2020, 167, 123–153.
Zhang, T.; Zhang, X. ShipDeNet-20: An only 20 convolution layers and <1-MB lightweight SAR ship detector. IEEE Geosci. Remote Sens. Lett. 2020, 18, 1234–1238.
Zhang, T.; Zhang, X. High-speed ship detection in SAR images based on a grid convolutional neural network. Remote Sens. 2019, 11, 1206.
Zhang, T.; Shi, J.; Wei, S. Depthwise Separable Convolution Neural Network for High-Speed SAR Ship Detection. Remote Sens. 2019, 11, 2483.
Zhang, T.; Zhang, X.; Shi, J.; Wei, S.; Wang, J.; Li, J.; Su, H.; Zhou, Y. Balance scene learning mechanism for offshore and inshore ship detection in SAR images. IEEE Geosci. Remote Sens. Lett. 2020, 19, 1–5.
Wang, Z.; Zeng, X.; Yan, Z.; Kang, J.; Sun, X. AIR-PolSAR-Seg: A large-scale data set for terrain segmentation in complex-scene PolSAR images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 3830–3841.
Wang, Y.; Wang, C.; Zhang, H.; Dong, Y.; Wei, S. A SAR dataset of ship detection for deep learning under complex backgrounds. Remote Sens. 2019, 11, 765.
Wei, S.; Zeng, X.; Qu, Q.; Wang, M.; Su, H.; Shi, J. HRSID: A high-resolution SAR images dataset for ship detection and instance segmentation. IEEE Access 2020, 8, 120234–120254.
Iervolino, P.; Guida, R. A novel ship detector based on the generalized-likelihood ratio test for SAR imagery. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2017, 10, 3616–3630.
Li, M.-D.; Cui, X.-C.; Chen, S.-W. Adaptive superpixel-level CFAR detector for SAR inshore dense ship detection. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5.
He, J.; Wang, Y.; Liu, H.; Wang, N.; Wang, J. A novel automatic PolSAR ship detection method based on superpixel-level local information measurement. IEEE Geosci. Remote Sens. Lett. 2018, 15, 384–388.
Albawi, S.; Mohammed, T.A.; Al-Zawi, S. Understanding of a convolutional neural network. In Proceedings of the 2017 International Conference on Engineering and Technology (ICET), Antalya, Turkey, 21–23 August 2017; pp. 1–6.
Girshick, R. Fast r-cnn. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1440–1448.
Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards realtime object detection with region proposal networks. In Advances in Neural Information Processing Systems; Neural Information Processing Systems Foundation: Montreal, QC, Canada, 2015.
Kang, M.; Leng, X.; Lin, Z.; Ji, K. A modified faster R-CNN based on CFAR algorithm for SAR ship detection. In Proceedings of the 2017 International Workshop on Remote Sensing with Intelligent Processing (RSIP), Shanghai, China, 19 May 2017.
Liu, Y.; Zhang, M.-H.; Xu, P.; Guo, Z.-W. SAR ship detection using sea-land segmentation-based convolutional neural network. In Proceedings of the 2017 International Workshop on Remote Sensing with Intelligent Processing (RSIP), Shanghai, China, 19 May 2017.
Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016.
Redmon, J.; Farhadi, A. YOLO9000: Better, faster, stronger. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 9 November 2017; pp. 6517–6525.
Redmon, J.; Farhadi, A. Yolov3: An incremental improvement. arXiv 2018, arXiv:1804.02767v1.
Jocher, G. YOLOv5. Available online: https://github.com/ultralytics/yolov5 (accessed on 1 April 2022).
Bochkovskiy, A.; Wang, C.Y.; Mark Liao, H.Y. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv 2020, arXiv:2004.10934v1.
Singla, S.K.; Garg, R.D.; Dubey, O.P. Ensemble machine learning methods for spatiotemporal data analysis of plant and ratoon sugarcane. Intell. Data Anal. 2021, 25, 1291–1322.
Lee, M.-H.; Yeom, S. Multiple target detection and tracking on urban roads with a drone. J. Intell. Fuzzy Syst. 2018, 35, 6071–6078.
Chang, Y.-L.; Anagaw, A.; Chang, L.; Wang, Y.C.; Hsiao, C.-Y.; Lee, W.-H. Ship detection based on YOLOv2 for SAR imagery. Remote Sens. 2019, 11, 786.
Kun, J.; Yan, C. SAR image ship detection based on deep learning. In Proceedings of the 2020 International Conference on Computer Engineering and Intelligent Control (ICCEIC), Chongqing, China, 6–8 November 2020; pp. 55–59.
Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 9 November 2017; pp. 2117–2125.
Chen, S.; Zhan, R.; Wang, W.; Zhang, J. Learning slimming SAR ship object detector through network pruning and knowledge distillation. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 1267–1282.
Ai, J.; Fan, G.; Mao, Y.; Jin, J.; Xing, M.; Yan, H. An improved SRGAN based ambiguity suppression algorithm for SAR ship target contrast enhancement. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5.
Kisantal, M.; Wojna, Z.; Murawski, J.; Naruniec, J.; Cho, K. Augmentation for small object detection. Comput. Sci. 2019.

Figure 1. Public SAR ship images: (a,b) are from the public dataset SSDD; (c) is from the public dataset HRSID; (d) is from the public dataset OpenSAR. For convenience of display, all pictures were stretched.

Figure 2. Common errors in ship target detection: (a) missed ship target; (b) false ship target.

Figure 3. Enhancement of ship target data: (a,c,e,g,i,k) images before data enhancement; (b,d,f,h,j,l) corresponding images after data enhancement.

Figure 4. Structural diagram of the spatial attention mechanism.

Table 1. Preliminary extraction results (Reprinted from Ref. [5]).

Polarization	Relative Length Error (%)	Relative Width Error (%)	Absolute Length Error (m)	Absolute Width Error (m)
VH	8.13	32.43	15.63	9.77
VV	7.33	41.10	14.22	11.51
Average	7.73	36.77	14.93	10.64

Table 2. Ablation experiment.

Networks	Dataset Size (%)	Precision (%)	Recall (%)	mAP (%)
Attention only	30	89.58	88.27	89.31
Attention (one layer) + data augmentation	30	90.73	90.59	90.87
Attention (two layer) + data augmentation	30	88.47	89.15	89.94
Attention (one layer) + loss	30	92.37	92.72	93.25
Attention (two layer) + loss	30	91.78	91.25	92.53
Attention (one layer) + data augmentation + loss	30	91.41	92.64	93.85
Attention (two layer) + data augmentation + loss	30	91.15	90.79	92.81
Attention only	100	90.91	89.84	92.89
Attention (one layer) + data augmentation	100	92.27	91.02	93.35
Attention (two layer) + data augmentation	100	91.62	92.24	93.10
Attention (one layer) + loss	100	90.45	93.83	93.79
Attention (two layer) + loss	100	92.76	92.58	93.52
Attention (one layer) + data augmentation + loss	100	93.84	92.81	94.27
Attention (two layer) + data augmentation + loss	100	92.59	91.21	93.96

Table 3. Comparative experiment.

Networks	Dataset Size (%)	Precision (%)	Recall (%)	mAP (%)
SSD + SSDD	30	77.23	78.16	79.57
YOLO v5 + SSDD	30	78.68	78.58	80.82
Our model + SSDD	30	90.17	91.14	92.92
FPN + SSDD	30	71.53	72.40	72.65
Our model + OpenSARShip	30	91.72	90.46	92.13
Our model + HRSID	30	91.51	90.48	91.96
SSD + SSDD	100	96.26	96.17	96.51
YOLO v5 + SSDD	100	96.53	96.38	96.60
Our model + SSDD	100	94.02	92.62	95.88
FPN + SSDD	100	94.46	93.20	95.93
Our model + OpenSARShip	100	92.67	93.84	94.95
Our model + HRSID	100	92.59	93.43	94.72

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
