Ship velocity estimation in SAR images using multitask deep learning

SAR satellites are used for monitoring ships worldwide. Moving ships are Doppler shifted by an amount proportional to their velocity, which produces an offset between the ship and its wake during SAR processing. We present a novel automatic method for calculating the ship velocity. The method relies on multitask deep learning to estimate the offset and ship heading, from which the ship velocity can be obtained. A convolutional neural network is trained using a coupled loss function that allows both parameters to be estimated at the same time. We show the method's effectiveness for ships in Sentinel-1 SAR images. For this purpose, a large dataset of 30,000 AIS-annotated SAR ship images is collected. These images have 20 × 22 m pixel resolution, and the ships generally do not have a clear wake. The AIS provides the true ship velocity and allows the method to be evaluated. As a result, we can determine the ship speed with an accuracy of 1.1 m/s. Although the offset disappears near the azimuth direction of the SAR image, our method remains reliable for all ships except those sailing within 2.5 degrees of the azimuth direction.


Introduction
Maritime surveillance from satellites has become increasingly important. The growing number of satellites and improved sensors enable more advanced data analysis. Larger ships must identify themselves through ship transponder systems for collision avoidance. One such system is the Automatic Identification System (AIS). It broadcasts the vessel identity, position, course over ground (COG), and speed over ground (SOG). The messages are relayed to ground stations or satellites, depending on the location. At high latitudes and on the open seas, messages can be days old and experience temporal gaps. In areas with high traffic, signals are frequently lost in data collisions (Ball, 2013; Høye et al., 2008). As vessels make ready for port call, transmissions slow down. AIS transponder use is more common on larger vessels, but varies by fleet and region. The AIS devices may be tampered with, or turned off, either by accident or deliberately. Dark ships are non-cooperative vessels that do not transmit AIS signals. They may be involved in criminal activities such as piracy, smuggling, oil spills, trespassing, or illegal fishing. These ships pose a risk to maritime traffic safety.
Dark ships can be detected in satellite imagery independently of AIS. But a satellite image is only a snapshot in time. It is thus important to gain as much information about the captured vessels as possible.
Correct estimation of velocity and course is paramount if one is to predict the whereabouts of a dark ship. There is an extensive literature on ship detection (Crisp, 2004; Tello et al., 2005), classification, and size estimation. Ship velocities are determined almost exclusively from the ship wake in satellite images. The velocity can be estimated from the wavelength of the Kelvin waves in both SAR (Graziano et al., 2016b) and optical images (Heiselberg and Heiselberg, 2017). Another method exploits the time delay between multispectral image acquisitions to estimate ship velocities (Heiselberg, 2019). A more common method for ship velocity estimation in SAR images is the Doppler offset: movement perpendicular to the satellite track gives rise to an offset, so the ship appears to be located either above or below its wake. The ship velocity is then estimated by measuring the distance from the ship to the wake vertex.
Wake detection requires complex methods. The wake appears as a contrasting dark or bright line which can be masked in high sea states, sea clutter, and a noisy background (Graziano et al., 2016a). A clearly distinguishable wake is a necessity, which requires the ship to be sailing at a sufficient wake-generating velocity. The linear components of the wake lines are generally detected using the Radon transform (Graziano et al., 2016a; Copeland et al., 1995; Courmontagne, 2005; Graziano et al., 2016b), but have also been detected by application of a wavelet correlator (Kuo and Chen, 2003), a Hough transform based CFAR detector (Jiaqiu et al., 2011) and, recently, convolutional neural networks (CNNs) (Kang and Kim, 2019). Previous work typically separates the detection of ship and wake arms. The wake arms are followed to the wake vertex, from which the offset can be measured. The offset disappears when the ship is sailing parallel to the satellite orbit. Current methods are not reliable for ships sailing within 10° of the parallel (Graziano et al., 2016b; Tunaley, 2003).
Estimation of the ship course primarily relies on the Radon transform, where resolving the 180° ambiguity is subject to the isolation of the ship wake (Graziano et al., 2016b). Without the presence of wakes, the course estimation depends on the identification of the bow or stern. The SAR image resolution plays an important role: for smaller ships, the bow and stern can become almost indistinguishable. Previous SAR ship velocity analyses have predominantly focused on high resolution SAR imagery. Images in the X-band have more clearly identifiable bow, stern, and wake. The C-band Sentinel-1 SAR images are of lower resolution, so clear wakes are generally only visible for fast ships (Graziano and Renga, 2021; Tings et al., 2021). When the wake is not as clear as in X-band SAR images, it becomes more difficult to estimate the ship course and offset. Yet, Sentinel-1 images have the advantage of wider spatial and more frequent temporal coverage (Anon, 0000).
In this article, we propose a novel, direct, and fast method for estimating the ship velocity and course. It uses multitask learning, is developed for medium resolution C-band SAR images, and can be integrated in serial or parallel with current ship detection algorithms. It automatically measures the offset and ship course, and thus does not rely on prior wake detection. The method relies on a CNN trained using multitask learning. For this purpose, we construct a dataset consisting of 30,000 ship images, acquired from more than 300 VV+VH polarized Sentinel-1 SAR scenes. The images are annotated with the AIS COG and SOG. A custom-made CNN is then trained to find the offset and course, and thus the velocity. The large dataset allows us to provide a detailed statistical analysis of ship velocity estimation.
The manuscript is organized as follows: The Sentinel-1 SAR and AIS data are described in Section 2, and the method in Section 3. Included in the method section is a description of the dataset, offset, CNN, multitask learning, and training. In Section 4, the results are presented for offsets, COG, and SOG using a subset of 6000 ships from the dataset. The results are discussed in Section 5 with a comparison to previous work, where only 16 ships in total were analyzed. Finally, a conclusion and outlook are given.

Data acquisition
The Sentinel-1 satellites provide a wide selection of SAR products. We analyze the dual polarization level-1 Ground Range Detected High resolution (GRDH) scenes in the Interferometric Wide Swath (IW) mode. These contain a co- and cross-polarized channel and provide the highest resolution of 20 × 22 m, oversampled to 10 m pixels. We have acquired more than 300 SAR scenes in the VV+VH polarization of areas with dense maritime traffic. All scenes were retrieved via the ASF DAAC (Anon, 2021).
AIS data was acquired corresponding to the spatial extent (footprint) of the SAR scenes. It was then filtered to a temporal interval of two hours before and after the sensing time (see Heiselberg et al., 2022 for details). The interval is chosen to guarantee ample amounts of AIS data on each side of the SAR scenes' recording time. AIS data points contain a Maritime Mobile Service Identity (MMSI) number, which is an identifier of the ship, along with a timestamp, latitude and longitude coordinates, speed over ground (SOG), and course over ground (COG). The ship velocity and SOG are considered synonymous. We aggregate data with the same MMSI number to form a track of ship positions. The AIS data and tracks are shown in Fig. 1.

Methodology
The ship positions (x, y) were determined both by AIS and in SAR images. Ships and AIS were subsequently matched as detailed in Fig. 2.

SAR and AIS positions
Ships and their positions were found in SAR images by a Continuous Wavelet Transform (CWT) based ship detection algorithm (Heiselberg et al., 2022; Du et al., 2006). The algorithm was not based on deep learning and thus did not require an existing dataset for training. It was applied to the more than 300 Sentinel-1 SAR scenes. The CWT detection algorithm convolved the image with a wavelet and selected the SAR ship position at the highest wavelet response, which was often the protruding bridge. For container ships and oil tankers, the bridge is usually located at the stern, along with the AIS transponder. Examples of two SAR ships are shown in the insert of Fig. 1. They show the ship, sidelobes, and, in most cases, some clutter that extends up or down to the wake. Because the wakes were stationary and thus not Doppler shifted, the wake was expected to lie at the true ship position as given by AIS.
AIS signals contain a Maritime Mobile Service Identity (MMSI) number which identifies the ship, a timestamp, and latitude and longitude coordinates. Using the MMSI we aggregated the data and formed a spatio-temporal track of each ship. These tracks were then interpolated with a cubic spline as in Sang et al. (2012) to the recording time of the SAR scene. The interpolated (lat, lon) was converted to the range and azimuth pixel coordinates (x, y) in the SAR scene. The speed over ground (SOG) and course over ground (COG) provided by the AIS messages were likewise interpolated.
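The track interpolation step above can be sketched as follows. This is a minimal illustration using SciPy; the function name and the use of independent per-coordinate splines are our assumptions, and in practice the SOG and COG are interpolated the same way.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def interpolate_track(times, lats, lons, t_sar):
    """Interpolate one ship's AIS track (aggregated by MMSI) to the
    SAR sensing time t_sar, as in Sang et al. (2012).

    times : sorted AIS timestamps (e.g., seconds since an epoch)
    Returns the interpolated (lat, lon) at t_sar.
    """
    lat_spline = CubicSpline(times, lats)
    lon_spline = CubicSpline(times, lons)
    return float(lat_spline(t_sar)), float(lon_spline(t_sar))
```

The returned (lat, lon) would then be converted to range/azimuth pixel coordinates in the SAR scene.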

Doppler shift
The azimuth shift of moving targets in SAR imagery has been studied thoroughly (Raney, 1971; Ouchi, 1985; Ouchi et al., 2002). It appears due to the different imaging processes in the range and azimuth directions. In the azimuth direction, the position of the target is located via the Doppler returns (Ouchi, 1985). The multilook processing of SAR gives rise to a time-lag between looks. A Doppler shift can then be introduced by a target with a velocity in the range direction. Movement in the azimuth direction corresponds to a relative change in the SAR instrument velocity; this contribution can be considered negligible due to the velocity of the satellite. Consequently, the ships in the SAR scenes were Doppler shifted in the azimuth direction due to their range velocity. We define the azimuth offset as the difference between the SAR and AIS azimuth positions,

Δy = y_SAR − y_AIS. (1)

The Doppler offset can also be calculated using the AIS information as (see e.g. Palubinskas et al., 2009)

Δy = (R/V) SOG sin(COG) sin(θ), (2)

where V = 7.4 km/s is the Sentinel-1 satellite velocity, θ is the angle of incidence between 29 and 46 degrees, and R is the slant range distance. The offset can be seen in Fig. 1 and is only present in the azimuth direction. By measuring the offset and ship course, the velocity can be obtained from Eq. (2) by referencing the already known SAR parameters. Ships sailing at a cruising speed around 8 knots were thus Doppler shifted up to ∼250 m (25 pixels) azimuthally, up or down when sailing right or left in the slant range direction, respectively. We measured the azimuth offset from the SAR ship to the AIS signal where, for example, a positive offset indicates that the ship's SAR position was below the wake/AIS position and the ship was sailing between 0°-180° (towards the image right), and vice versa.
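Solving Eq. (2) for the speed gives SOG = Δy · V / (R sin(COG) sin(θ)). A minimal sketch of this inversion follows; the numerical values used in testing are illustrative, not taken from the paper.

```python
import math

V_SAT = 7400.0  # Sentinel-1 satellite velocity [m/s]

def sog_from_offset(offset_m, cog_deg, slant_range_m, incidence_deg):
    """Invert Eq. (2): recover the SOG from a measured azimuth offset.

    COG is measured relative to the azimuth direction, so the inversion
    is undefined for ships sailing near COG = 0 or 180 degrees, where
    sin(COG) and hence the offset vanish.
    """
    s = math.sin(math.radians(cog_deg)) * math.sin(math.radians(incidence_deg))
    if abs(s) < 1e-12:
        raise ValueError("offset vanishes near the azimuth direction")
    return offset_m * V_SAT / (slant_range_m * s)
```

With a slant range of roughly 800 km and an incidence angle near the middle of the 29-46 degree interval, a cruising speed of about 4 m/s indeed yields an offset of a few hundred meters, consistent with the ∼250 m quoted above.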

AIS-SAR association
The AIS COG was given relative to the North Pole and was rotated into the range/azimuth coordinate system of the SAR scene. This transformation depended on the inclination angle i = 98.18° for the Sentinel-1 ascending orbit (i = 81.82° for descending) and the ship latitude λ (see Heiselberg, 2019 for details) as

COG_SAR = COG − arcsin(cos(i)/cos(λ)). (3)
The ship positions were matched to the Doppler offset corrected AIS ship positions as described in Heiselberg et al. (2022). An example of an AIS-SAR matched scene is shown in Fig. 3, with only a few AIS signals without a match to a SAR ship. Detections and AIS signals on land and in harbors were removed by using the GSHHG database (Wessel and Smith, 1996) for land-masking.
The detected (Eq. (1)) and calculated (Eq. (2)) Doppler offsets were very closely correlated, as shown in Fig. 4, i.e. Δy_SAR ≃ Δy_AIS. After azimuth correction, the AIS-SAR positional error distributions were similar in both the range and azimuth directions (see Fig. 5) and could now be viewed as an inherent AIS uncertainty. The noticeable line in Fig. 4 at Δy = 0 is due to stationary ships and shows the AIS uncertainty distribution.

Dataset
Three hundred Sentinel-1 SAR scenes were acquired over a large variety of regions, such as the English Channel, the Gulf of Mexico, Oman, the Strait of Gibraltar, Singapore, Brazil, the Yellow Sea, South Africa, and more. Ships typically follow predetermined shipping routes. Moreover, the SAR scenes have varying backgrounds depending on the incidence angle. A diversity of locations was thus required to build a balanced dataset of ships sailing in multiple directions. We found that ship speeds rarely exceeded 10 m/s (∼20 knots) and discarded 400 ships by requiring a maximum velocity of 30 knots; some of these were rescue helicopters or airplanes with AIS transponders. The upper threshold removed outliers due to AIS errors, which also affect variances significantly. For each AIS-associated SAR ship detection (see Fig. 3), an image was centered around the detection. The azimuth offset of ships sailing at 30 knots is approximately 115 pixels, and so images of size 256 × 64 × 2 (height, width, co- and cross-polarization) were selected. In total, 30,000 SAR ship images were collected from the SAR scenes. More than a dozen examples are shown in Fig. 6. The dataset was then created by combining each of the 16-bit ship images with the associated AIS (position, COG, SOG) and SAR values.

Convolutional Neural Network (CNN)
CNNs have become a common deep learning technique for learning and extracting complex features from images. The known input (a SAR image) is mapped to an output by passing it through a series of convolution, pooling, and non-linearity (activation) operations. The output is compared to a known desired output, from which a loss is computed. Based on this loss, the model parameters (weights) of the CNN are adjusted. By iteratively passing images through the CNN, the weights are adjusted to minimize the loss function. The CNN is then said to have learned to produce the desired output from the input.
For dark ships that do not transmit AIS signals, Eq. (2) becomes an equation with three unknowns: the velocity, COG, and Doppler offset. The COG and Doppler offset are encoded in the SAR image as the ship orientation and wake position, respectively. By training a CNN to find the COG and Doppler offset from the SAR image, the velocity can be obtained from Eq. (2) by utilizing the already known SAR parameters.
Various CNN models have been used for ship and wake detection, ship classification, and ship-iceberg discrimination in SAR images. In this study, we created a CNN based on state-of-the-art principles to estimate two variables: the ship azimuth offset and course. The velocity could then be obtained from Eq. (2). It is an improved version of the best CNN developed in Heiselberg et al. (2022). Our proposed network uses a 7 × 7 convolution layer with stride 2, followed by a maxpool. This results in a 4× downsampling of the input image, like the first layer of a ResNet (He et al., 2016). The CNN has 4 stages, each of which after the first is preceded by a 2 × 2 maxpool with stride 2 and a 1 × 1 convolution for further downsampling. The 4 stages consist of 3, 3, 9, and 3 blocks respectively, conserving the 1:1:3:1 ratio suggested in Liu et al. (2022). Each block comprises a 3 × 3 depth-wise convolution followed by a 1 × 1 convolution and a residual connection, as in Howard et al. (2017). A GELU (Hendrycks and Gimpel, 2016) activation function was applied after each convolution in the block. The number of feature parameters increased as 32, 64, 128, and 256 depending on the stage, e.g., the blocks in stage 1 had 32 feature parameters. Batch normalization was applied immediately following all convolutional layers in the network. Instead of using dropout (Srivastava et al., 2014) layers to reduce overfitting, our CNN was regularized by stochastic depth (Huang et al., 2016). The skip probability increased linearly from 0 to 50%, e.g., the first block had a 0% chance of being skipped while the 18th had a 50% chance. An overview of the architecture is provided in Fig. 7.
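A single stage block of the described architecture could be sketched as below in PyTorch. This is a non-authoritative reconstruction: the exact ordering of the batch normalization and activation within the block is our assumption, and stochastic depth is omitted for brevity.

```python
import torch
import torch.nn as nn

class StageBlock(nn.Module):
    """One block: 3x3 depth-wise conv, then 1x1 point-wise conv, with
    batch norm and GELU after each convolution, plus a residual
    connection (cf. Howard et al., 2017; Liu et al., 2022)."""

    def __init__(self, channels: int):
        super().__init__()
        # groups=channels makes the 3x3 convolution depth-wise
        self.dw = nn.Conv2d(channels, channels, kernel_size=3,
                            padding=1, groups=channels)
        self.bn1 = nn.BatchNorm2d(channels)
        self.pw = nn.Conv2d(channels, channels, kernel_size=1)
        self.bn2 = nn.BatchNorm2d(channels)
        self.act = nn.GELU()

    def forward(self, x):
        y = self.act(self.bn1(self.dw(x)))
        y = self.act(self.bn2(self.pw(y)))
        return x + y  # residual connection preserves the input shape
```

Stacking 3, 3, 9, and 3 such blocks with 32, 64, 128, and 256 channels, with a downsampling maxpool and 1 × 1 convolution between stages, would reproduce the stage structure described above.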

Multitask learning
In order to train our CNN to estimate the azimuth offset, we first defined suitable loss functions. The azimuth offset error distribution was estimated by computing the residuals |Δy − Δy_AIS|. Likewise, the COG errors were estimated by interpolating the AIS latitude and longitude coordinates to calculate COG_AIS (Palubinskas et al., 2009), from which the residual errors |COG − COG_AIS| were computed. These error distributions were both approximately exponentially distributed, and the COG had a particularly large tail of outliers. To avoid focusing on these outliers, we aimed at minimizing the mean absolute error (MAE) of both parameters. The loss functions were then

L_Δy = (1/N) Σ |Δy − Δy_CNN| (4)

and

L_COG = (1/N) Σ |COG − COG_CNN|. (5)

Multitask learning involves the weighted combination of each task loss into one. The CNN was tasked to estimate both the azimuth offset and the COG by normalizing and combining the losses of Eqs. (4) and (5). Training was dependent on the relative weighting of the losses; a proper weighting avoided overfitting one parameter before optimal training of the other. Following Kendall et al. (2018), each loss was weighted with the uncertainty of the task, so that each contribution was effectively normalized to the same size,

L = L_Δy/σ_Δy + L_COG/σ_COG. (6)

The task uncertainties σ_Δy = 7.31 pixels and σ_COG = 34° were estimated as the MAEs of the residuals |Δy − Δy_AIS| and |COG − COG_AIS|. Analogously, the SOG uncertainty σ_SOG = 0.32 m/s was estimated as the MAE of |SOG − SOG_AIS|. In practice, the COG was estimated by tasking the CNN to produce sin(COG) and cos(COG) by applying a hyperbolic tangent activation function; the angle was then recovered with the arc tangent function. It was also necessary to wrap the COG loss (Eq. (5)) around 180°. No activation was applied to the offset output.
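The combined loss and the sin/cos course encoding can be illustrated with a small NumPy sketch. The function names are ours; the uncertainty constants are the values quoted above.

```python
import numpy as np

SIGMA_OFFSET = 7.31  # task uncertainty of the azimuth offset [pixels]
SIGMA_COG = 34.0     # task uncertainty of the COG [degrees]

def cog_from_sincos(s, c):
    """Recover the course from the network's tanh-bounded sin/cos outputs."""
    return np.degrees(np.arctan2(s, c)) % 360.0

def wrapped_mae(true_deg, pred_deg):
    """MAE of angular residuals wrapped into [0, 180] degrees (Eq. (5))."""
    d = np.abs(np.asarray(true_deg) - np.asarray(pred_deg)) % 360.0
    return np.mean(np.minimum(d, 360.0 - d))

def multitask_loss(off_true, off_pred, cog_true, cog_pred):
    """Uncertainty-weighted sum of the two MAE losses (Eq. (6))."""
    l_off = np.mean(np.abs(np.asarray(off_true) - np.asarray(off_pred)))
    return l_off / SIGMA_OFFSET + wrapped_mae(cog_true, cog_pred) / SIGMA_COG
```

Because each term is divided by its task uncertainty, an offset error of one σ and a course error of one σ contribute equally to the total loss.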

Training
The dataset was randomly split into a training set consisting of 80% of the data and a testing set containing the remaining 20%. The training set was then further divided into a smaller training set and a validation set, again using an 80/20% split. By iterating over the training dataset, the images were sequentially mapped to the output as shown in Fig. 7. From the outputs, the losses of Eqs. (4) and (5) were computed by referencing the true values given by AIS over a batch of N images. These losses were then combined via Eq. (6) and used to adjust the parameters of the CNN. We used an AdamW (Loshchilov and Hutter, 2018) optimizer with default parameters and a batch size of N = 24. Any CNN with enough parameters can memorize a dataset instead of learning the desired properties, an outcome known as overfitting. Once the entire training set had been passed through the CNN, the network was evaluated on the validation dataset. The CNN had not seen the validation set, and the performance could thereby be gauged. After one such pass through the dataset, the process was repeated. Fig. 8 shows the validation loss of the CNN after each pass through the training dataset (epoch). We saved the state of the model at the epoch with the lowest validation loss. Because of this selection bias, the CNN was finally evaluated on the unseen test dataset. We did not apply any augmentation or normalization to the images during training. In the following section, results achieved by applying the trained model to the test dataset are presented.
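The nested 80/20 splits described above can be sketched as follows; the helper is hypothetical and splits by index, which is our assumption.

```python
import random

def split_dataset(n, seed=0):
    """Shuffle indices, take an 80/20 train/test split, then split the
    training part 80/20 again into training and validation sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(0.8 * n)
    train_all, test = idx[:cut], idx[cut:]
    cut2 = int(0.8 * len(train_all))
    return train_all[:cut2], train_all[cut2:], test
```

For the 30,000-image dataset this yields 19,200 training, 4800 validation, and 6000 test images, matching the 6000-ship test set used in Section 4.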

Results
The trained CNN was applied to the ships in the test dataset to obtain the COG and Doppler offset. Using the already known satellite parameters and Eq. (2), the velocity was then calculated.
Table 1 lists the MAE of each estimated parameter for the networks trained using the single and multitask loss functions. In this section, we present the results of our proposed method and discuss the parameter estimation. A comparison with previous studies is provided in Section 5.

Doppler offsets
Relative to the SAR ship, the AIS position and the Doppler offset can also be construed as the location of the ship wake. The CNN is trained to find the wake, and thus the offset, from the SAR image alone. Where the azimuth offsets can range between ±75 pixels (750 m), the estimated offsets have a MAE of only 5.6 pixels. It should be emphasized that the CNN is given no AIS information about the ship position, offset, COG, or SOG. From the SAR image alone, the CNN can quite accurately predict the true position of the ship and wake, and thus the azimuth offset. Fig. 6 displays 18 representative examples, where four positions are marked for each ship. These examples demonstrate how well the CNN works. The CNN can learn the wake structure and apply pattern matching to extract the wake position, similarly to wavelet filters. This allows the wake to be identified even when it is almost hidden in sea clutter. Fig. 9 shows that the offsets follow the sine curve of the COG, as predicted by Eq. (2).

COG
The COG is extracted directly from the SAR image by the CNN. Some ships appear pixelated, and it is thus difficult to determine the COG exactly by eye. Despite the resolution of the SAR images, the network can accurately determine the COG within 20° for faster ships (see Fig. 10). The CNN can automatically extract the ship orientation, e.g. by approximating the Radon transform. Combined with the wake position, this allows the COG to be precisely estimated. The COG is determined by the direction in which the ship is sailing and not the ship heading (the direction the ship is pointing). These are almost interchangeable for faster ships, but can differ at lower velocities. Drifting becomes a factor for slower ships, and anchored ships swing with the ocean currents. In these conditions, the COG may vary greatly from the heading. The COG error at lower velocities is thus expected to be higher, as also seen in Fig. 10. Because the offset and COG are strongly coupled, our method relies on the wake to determine the correct COG. This is demonstrated in Fig. 12, where the same ship is augmented: the estimated COG correctly follows the wake position despite clear bow/stern signatures. For slow but large ships without a wake, as seen in the examples in Fig. 11, our algorithm instead determines the ship heading. Small ships on the order of the SAR image resolution have no clear orientation, or bow and stern features. Small and/or slow ships also generate little wake, unlike large and/or fast ships that generate a strong directional wake. There is thus no way for the CNN to determine their COG.
Several methods were tested for improving the COG determination. Image rectification was studied in Fischer et al. (2015), where images were rotated randomly and a CNN was used to estimate the angle of rotation. We tested their best methods, such as: (i) using CNNs for quadrant proposal followed by 0-90 degree refinement; (ii) transforming the regression task into a classification problem with 5, 10, or 15 degree spacing; and (iii) augmentation of the images by rotation or flipping. None of these methods improved our results. We believe that this is due to the strong coupling of the azimuth offset and COG, as shown in Fig. 12.

SOG
The final step is to calculate the ship velocity by inserting the CNN azimuth offset and COG into Eq. (2). In Fig. 13 the predicted SOG values are shown vs. the true values. The offset of Eq. (2) vanishes when sailing in the azimuth direction, and it is therefore impossible to determine the speed when COG = 0° or 180°. This is shown in Fig. 14, where the SOG MAE increases sharply when ships sailing near the azimuth direction are included. To suppress these large errors, we follow Tunaley (2003) and Graziano et al. (2016a) and remove ships with a COG below a threshold angle around the azimuth direction. The offset, COG, and SOG MAEs are listed in Table 1 for a COG threshold of 2.5° around the azimuth directions.
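Filtering out ships near the azimuth direction amounts to a simple angular-distance test, sketched below; the function is hypothetical and the 2.5° default follows the threshold above.

```python
import numpy as np

def azimuth_mask(cog_deg, threshold_deg=2.5):
    """Return True for ships whose COG is at least `threshold_deg` away
    from both azimuth directions (0 and 180 degrees), where Eq. (2) is
    invertible for the speed."""
    d = np.asarray(cog_deg) % 180.0
    dist = np.minimum(d, 180.0 - d)  # angular distance to 0/180
    return dist >= threshold_deg
```

Only ships passing this mask are included when computing the MAEs in Table 1.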
The SOG error of the CNN calculations (∼1.1 m/s) is larger than the basic error in the AIS data (∼0.3 m/s), as shown in Fig. 15. At large speeds, the error increases, but as there are few ships, the statistics are poor. The SOG error decreases at slow speeds, but relative to the SOG it actually increases. This is related to the problems determining the COG for slow ships, as discussed above.
In Fig. 16 we look closer at multiple cases from the upper left part of Fig. 13, i.e. where the CNN underestimates the SOG. These examples predominantly feature smaller ships without clear wakes. Due to the lack of wake generation and bow and stern characteristics, our CNN is not able to precisely determine the azimuth offset or COG. We thus expect that our method will perform much better on higher resolution images. The resolution of the Sentinel-1 SAR images poses difficulties in velocity estimation. Small ships cannot be resolved by the 20 × 22 m pixels and appear as points with no orientation. Even less clear is the trace of their wake, if present at all. The offset and COG estimation, dependent on both the wake and the ship orientation, is bound to be more erroneous. This then leads to the errors shown in Fig. 16. No clear solution to this issue arises, as it is a result of the SAR instrument. Instead, we suggest modeling the parameter distribution to provide an error estimate. Assuming a Gaussian distribution, the mean and variance of the offset would instead be estimated by the CNN, and Eq. (4) would then minimize the negative log-likelihood of observing the true offset. An end-user would then be aware of the variance associated with the estimation. Another way of improving the SOG would be to omit smaller ships. Yet, we aimed at an analysis for ships of all sizes in medium resolution Sentinel-1 images.
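The suggested probabilistic extension would replace the MAE of Eq. (4) with a Gaussian negative log-likelihood, for which the CNN outputs a mean and a variance per ship. The following is a sketch of that suggestion, not part of the trained model.

```python
import numpy as np

def gaussian_nll(offset_true, mu, var):
    """Negative log-likelihood of the observed offsets under the
    per-ship Gaussian N(mu, var) predicted by the CNN. Minimizing this
    trains the network to report its own offset uncertainty."""
    var = np.maximum(var, 1e-6)  # guard against collapse to zero variance
    offset_true, mu = np.asarray(offset_true), np.asarray(mu)
    return np.mean(0.5 * (np.log(2.0 * np.pi * var)
                          + (offset_true - mu) ** 2 / var))
```

The predicted variance would then accompany each velocity estimate as an error bar for the end-user.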

Single-vs multitask CNN
We compared the multitask model with two singletask CNN models, trained to determine the azimuth offset using the loss of Eq. (4) and the COG using the loss of Eq. (5) separately. Our results show that, despite similar azimuth and COG errors (see Table 1), the SOG error is more than 20% higher for the singletask models. The multitask model learns the coupling between the azimuth offset and COG, e.g. for COGs between 0°-180° the offset is positive, and it is negative for 180°-360°. This allows the model to easily overcome the 180° ambiguity. The coupling is evident in Fig. 12 and is a direct consequence of training the CNN with the multitask loss of Eq. (6). The multitask loss effectively gives the algorithm a ''hint'' that the azimuth offset and COG are coupled. Furthermore, the CNN is forced to share weights to estimate both parameters, which regularizes the model. This allows the multitask model to outperform the two separate models, as was also found in Kendall et al. (2018). Our proposed method automatically combines the information of the bow, stern, and wake position for COG and azimuth offset estimation. It therefore becomes more robust for applications to lower resolution SAR images where the bow and stern can be indistinguishable. Moreover, the multitask method is twice as fast as the singletask approach.

Discussion
There are only a few previous similar analyses that can be compared to our results. They mostly use higher resolution TerraSAR-X images, but with no AIS data for ground truth validation of position, COG, and SOG.

COG estimation
The high resolution of TerraSAR-X, Radarsat, and COSMO/SkyMed can better identify the ship bow, stern, and wake. In previous studies, the COG was determined as a precursor to velocity estimation. The Radon transform was utilized, resolving the 180° ambiguity by correct isolation of the ship wake above or below the ship. These methods rely on a high ship velocity for wake generation and a high image resolution for wake distinction. The COG estimates are, as noted in Tunaley (2003), either correct within ±1°, indeterminate, or in a few cases completely off. Unfortunately, we were not able to find another study with ground truths.
The ship wake is still visible in the Sentinel-1 images, but does not allow for wake isolation as in the TerraSAR-X images. Thus, the ship bow and stern characteristics are necessary for correct COG estimation. Our method is likely to produce more accurate results when features are easier to discriminate than in the images of Fig. 6.

SOG estimation
Previous studies consist of a combined total of only 16 ground truth validated velocity estimates, selected from ideal cases with high resolution SAR images. Wake vertex extraction is a necessity in the methods proposed by Kang and Kim (2019), Tunaley (2003), and Graziano et al. (2016b,a). The azimuth offset is then measured as the distance from ship to vertex, which requires first detecting and isolating the wake. In Kang and Kim (2019), a wake bounding box was estimated using a CNN. After ship detection, Graziano et al. (2016b,a) masked the detected ship, leaving only the wake. A grid search around the ship in the Radon domain was then carried out (Tunaley, 2003). Different processing steps were then applied to enhance the wake. Finally, the wake vertex position was determined using the Radon transform. Yet, the vertices could not be separated when ships sail close to the azimuth direction. Thus, Tunaley (2003) set a threshold within 10° of the azimuth and Graziano et al. (2016b,a) at 15°, which is up to 1/6 of the possible sailing directions. An azimuth threshold was not mentioned in Kang and Kim (2019), but a velocity of at least 1 m/s, to allow for wake generation, was required. Such a rule is likely to exist for all studies relying on clear wake isolation. The methods mentioned consist of a series of steps: first the wake is detected, then the wake vertex is isolated, etc. Failure in any of these steps results in an indeterminate velocity.
In Kang and Kim (2019), the velocity estimation was carried out for five of the 189 ships studied. A MAE of 0.18 m/s was achieved in TerraSAR-X satellite images (3 × 3 m). One ship sailed at 2.79 m/s, while the remaining four sailed close to or faster than 5 m/s. Along Track Interferometry (ATI) utilizing the TanDEM-X constellation was also applied to the five ship wakes, yielding a MAE of 0.72 m/s. The few ships indicate that many velocity estimates were indeterminate, as the testing set made up only 2.6% of the data. Canadian Coast Guard reporting points were used in Tunaley (2003) for validation; a MAE of 1.3 m/s was found for three ships, all sailing faster than 5 m/s, in Radarsat images (8 × 8 m). Seven ships were surveyed in Graziano et al. (2016b) using AIS as ground truth, with a MAE of 0.89 m/s in both COSMO/SkyMed (3 × 3 m) and TanDEM-X Co-registered Single Look Slant Range Complex Experimental bistatic acquisition (1.2 × 6.6 m) images. Twenty-four ships in that study had indeterminate velocities due to the azimuth threshold, and the remaining ships all sailed faster than 4.9 m/s. Only one ship, sailing at 4.9 m/s, was validated in Graziano et al. (2016a), with a difference of 0.31 m/s in a TerraSAR-X image.
Our method does not rely on the clear isolation of the wake and is not subject to indeterminate velocities. It directly identifies the ship and wake vertex positions to automatically compute the azimuth offset. Furthermore, we can estimate velocities without a significant increase in error down to an azimuth threshold of 2.5° (see Fig. 14). For ships sailing at equal velocities, our results are comparable to Tunaley (2003) and Graziano et al. (2016b), and are, on average, only off by a couple of knots compared to Kang and Kim (2019) and Graziano et al. (2016a). This is despite the roughly 5 times lower image resolution used in this study. The low achieved MAE shows the effectiveness of the previously proposed methods. But a large part of their success owes to the high resolution COSMO/SkyMed, TerraSAR-X, or Radarsat images used. Here, we presented a robust method that is also applicable to lower resolution SAR images.

Dataset and training
Our dataset consisted of 30,000 AIS annotated ship images from 300 SAR scenes. The scenes covered many locations with different dominant sailing directions. This allowed us to create a large dataset with many backgrounds and incidence angles. The size of the dataset enabled a test dataset with 6000 ships. Our model was thus trained and tested on ships sampled globally, ensuring transferability to other regions by design. Even though the test and training sets were randomly partitioned, the size allowed both sets to maintain a similar distribution of both target parameters. Large SAR ship datasets already exist, such as the SSDD (Zhang et al., 2021), LS-SSDD (Zhang et al., 2020), and OpenSARShip (Huang et al., 2018). Yet, none of these provide the adequate image size or COG and SOG annotations.
In rare cases, the images contained more than one ship. We did not differentiate between these images and those of only a single ship. Yet the target ship was always in the center of the image. The CNN was left to learn how to extract the desired parameters from the correct ship.
The dynamic range of SAR images is generally large, and the images used in this study had a 16-bit depth. These types of images are usually normalized with the base 10 logarithm and then to the 0-1 value range (Bentes et al., 2016) before being passed through the CNN. We did not find this affected our results; it only slowed down the training. It should be noted that modern autograd libraries can handle 16-bit unsigned integers. There was thus no need for normalization, and none was applied.
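For reference, the customary preprocessing described above, which was deliberately not applied in this study, can be sketched as follows; the function name is illustrative:

```python
import numpy as np

def log_normalize(img_u16):
    # Common SAR preprocessing (not used in this study): base-10 log
    # compression of the 16-bit amplitudes, then min-max scaling to [0, 1].
    x = np.log10(img_u16.astype(np.float32) + 1.0)  # +1 avoids log10(0)
    return (x - x.min()) / (x.max() - x.min() + 1e-12)
```

Skipping this step means the network consumes the raw 16-bit values directly, which the autograd library casts as needed; the study found no accuracy difference, only faster training.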
In deep learning, image augmentations such as flipping, rotating, and shifting are commonly applied. This helps regularize the model and increases robustness, as the model is subject to more variation. Yet, augmentation must be applied with caution when estimating the azimuth offset and COG. For COGs between 0° and 180° the offset is positive, and for 180° to 360° it is negative. Not all augmentations (see Fig. 12) occur naturally in the SAR images. We found that applying these augmentations increased the SOG error, consistent with what we expected. Augmentation by vertically and horizontally flipping the image is possible. But we did not find it changed the results significantly, likely due to the size of the dataset.
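One way to read the sign rule above is that only the combined vertical-and-horizontal flip (a 180° rotation) preserves the physical offset-COG coupling, since it reverses the course and the offset together. The sketch below assumes illustrative conventions (azimuth along the vertical image axis, COG measured from the azimuth direction); the paper's actual dataset conventions may differ.

```python
import math

def rotate180_labels(offset, cog_deg):
    # Flipping vertically AND horizontally (a 180° rotation) reverses
    # the course, so the range velocity component and hence the
    # azimuth offset change sign together.
    return -offset, (cog_deg + 180.0) % 360.0

def coupling_ok(offset, cog_deg):
    # The sign rule from the text: offset positive for COG in
    # (0°, 180°), negative for (180°, 360°).
    return offset * math.sin(math.radians(cog_deg)) >= 0.0
```

A single flip alone, by contrast, mirrors either the apparent offset or the course's range component but not both, producing a label pair that violates the sign rule; this is one reading of why such augmentations do not occur naturally in the SAR images.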

CNN
The goal of this study was not to analyze deep CNNs, complex architectures, or tweak hyperparameters. Our analysis confirms the general experience with machine learning that it is data driven rather than model driven. We therefore focused on the physical aspects of SOG determination in Sentinel-1 SAR data. A CNN was constructed that performed well on that data and allowed for fast training, which further enabled experimentation. The primary decisions are discussed in the following, and SOG errors are in reference to the 2.5° threshold as in Table 1. Initially, we attempted to modify the model suggested in Heiselberg et al. (2022), which was among the best in discriminating SAR ships from icebergs. Training was slow for that model, as it did not use initial downsampling, because it was designed for smaller 75 × 75 images. Instead, we tested numerous popular models. These are regularized via deep and thin architectures (He et al., 2016) and designed for huge datasets and complex classification tasks. These models had many parameters that led to long training times and resulted in overfitting. As an example, the ResNet18 achieved a SOG MAD of 1.2 m/s (> 2.5°) despite having more parameters and longer training time. The proposed model was at first based on the lessons learned in Liu et al. (2022). It uses a "patchify" 4 × 4 stride 4 convolution stem and 2 × 2 stride 2 convolutions for downsampling. The model is thus fully convolutional, as proposed in Springenberg et al. (2014). We swapped the initial "patchify" layer with that of He et al. (2016). This led to a 0.05 m/s SOG MAD decrease. Replacing the convolutional downsampling with a strided maxpool operation led to a further decrease of 0.1 m/s. This may be due to the regularizing effect of fewer trainable parameters. We also attempted to regularize the network by applying dropout, but found it worked poorly in combination with batch normalization. Stochastic depth proved a superior way of regularizing the network, reducing the SOG MAD by 0.12 m/s. These changes were the most significant in decreasing the SOG MAD. Further modifications, discussed below, had less effect but simplified the model and aided training. The blocks of Liu et al. (2022) were swapped with the blocks suggested in Howard et al. (2017); though both use depthwise convolutions, we found the latter to produce better results. The aggressive initial downsampling, few features, and depthwise convolutions led to a significant decrease in parameters, computations, and thus training time. Replacing the layer normalization recommended in Liu et al. (2022) with batch normalization was detrimental for training and regularization.
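Stochastic depth, which gave the largest single regularization gain above, can be sketched independently of any framework. This is a generic illustration of the technique, not the paper's implementation:

```python
import numpy as np

def stochastic_depth_block(x, branch, p_drop, training, rng):
    # Residual block with stochastic depth: during training the whole
    # residual branch is skipped with probability p_drop; at inference
    # the branch output is scaled by its survival probability so the
    # expected activation matches training.
    if training:
        if rng.random() < p_drop:
            return x  # identity only: the branch is dropped this pass
        return x + branch(x)
    return x + (1.0 - p_drop) * branch(x)
```

Dropping entire residual branches shortens the effective network depth during training, a regularizer that, unlike dropout, composes cleanly with batch normalization.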
The model was tweaked so as to achieve the best validation results in both single- and multitask settings. Since the weights are shared in the multitask case, this may provide an added regularizing effect that improves the multitask model. The MAEs for both offset and COG, shown in Table 1, are similar for single- and multitask training. Yet the SOG MAE is not, owing to how multitask learning teaches the model the coupling between offset and COG.
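Since Eqs. (4)-(6) are not reproduced in this section, the following is only a plausible shape for a coupled multitask loss, assuming an L1 offset term plus a wrap-aware COG term with illustrative weights:

```python
import numpy as np

def multitask_loss(offset_pred, offset_true, cog_pred_deg, cog_true_deg,
                   w_offset=1.0, w_cog=1.0):
    # Hypothetical coupled loss, not the paper's Eqs. (4)-(6):
    # L1 on the azimuth offset plus an angular L1 on COG that
    # respects the 0°/360° wrap-around.
    l_offset = np.abs(offset_pred - offset_true).mean()
    diff = np.abs(cog_pred_deg - cog_true_deg) % 360.0
    l_cog = np.minimum(diff, 360.0 - diff).mean()
    return w_offset * l_offset + w_cog * l_cog
```

Optimizing both terms through shared weights is what lets the network internalize the offset-COG coupling that a single-task model never sees.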

Conclusion
This study proposes a novel method of estimating the azimuth offset and course of ships. Combined with already known SAR parameters, the ship velocity is then obtained. The method relies on a CNN trained and tested on a dataset of 30,000 Sentinel-1 SAR ship images. These images have a medium resolution of 20 × 22 m. Such lower resolution SAR images are more abundant and cover larger areas. Yet, the V-shaped ship wake is not clear, making velocity estimation by wake detection difficult. The proposed method eliminates the need for separate wake detection. This study uses a multitask loss to estimate two variables at the same time. The multitask learning enhances the predictive power of the CNN by hinting that the azimuth offset and course are related. A natural extension of this work is to estimate more ship parameters. The ship length and width can be calculated by adding the appropriate terms to the loss function. Multitask learning may also improve ship detection and classification tasks.
Estimation of the velocity and course of dark ships is crucial for maritime safety. It is difficult to survey remote areas, such as the enormous Arctic and North Atlantic. Satellite based methods are thus important for nations with limited resources.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 1 .
Fig. 1. Sentinel-1 SAR image of Cape Town, South Africa. The image is overlaid with AIS data ranging in color from blue (before) to red (after) the recording time of the SAR image. Two vessels can be seen in the insert sailing in opposite directions. They are displaced up and down in the azimuth direction. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 3 .
Fig. 3. AIS-SAR association performed on a scene of the Yellow Sea, south of Rongcheng, China. Green circles indicate matches between AIS and SAR. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 6 .
Fig. 6. SAR ship detection image examples. The blue circle shows the SAR position of the ship, the yellow circle the estimated azimuth offset according to Eq. (2) relative to the ship position, the green circle the AIS signal position, and the red circle shows the CNN predicted wake position. Arrows follow the same color scheme and show ship heading. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 7 .
Fig. 7. The architecture of the convolutional neural network. The input image with dimensions 256 × 64 × 2 is mapped to two outputs.

Fig. 8 .
Fig. 8. The validation loss during training. One epoch is equal to one pass through the dataset. The losses are given by Eqs. (4), (5), and (6).

Fig. 9 .
Fig. 9. Scatter plot of the offset vs. SOG. The color scale depicts the difference between estimated and actual SOG. Lines show estimated azimuth offset based on selected SOG values.

Fig. 10 .
Fig. 10. COG MAE vs. SOG. Vertical lines denote SOG bins. The AIS error is calculated similarly, as in Section 3.6.

Fig. 11 .
Fig. 11. Examples of erroneous COG estimation. An explanation of the markers is given in Fig. 6.

Fig. 12 .
Fig. 12. Offset and COG estimation for the same augmented ship image. In order left to right: original image, vertically flipped, horizontally flipped, vertically and horizontally flipped. An explanation of the markers is given in Fig. 6.

Fig. 13 .
Fig. 13. Scatter plot of the velocity vs. SOG. The color scale depicts the difference between estimated and actual COG.

Fig. 15 .
Fig. 15. SOG MAE vs. SOG. Vertical lines denote SOG bins. The AIS error is calculated as in Section 3.6.

In Fig. 15, the blue patch of cruising ships corresponds nicely with the SOG MAE dip of Fig. 13. At large speeds, the error increases, but as there are few ships, the statistics are poor. The SOG error decreases at slow speeds, but relative to the SOG it actually increases. This is related to the problems determining COG for slow ships, as discussed above. In Fig. 16 we look closer at multiple cases from the upper left part of Fig. 13, i.e., where the CNN underestimates the SOG. These examples predominantly feature smaller ships without clear wakes. Due to the lack of wake generation and bow and stern characteristics, our CNN is not able to precisely determine the azimuth offset nor the COG. We thus expect that our method will perform much better for higher resolution images. The resolution of the Sentinel-1 SAR images poses difficulties in velocity estimation. Small ships cannot be resolved by the 20 × 22 m pixels and appear as points with no orientation. Even less clear is the trace of their wake, if present at all. The offset and COG estimation, dependent on both the wake and the ship orientation, is bound to be more erroneous. This then leads to the errors shown in Fig. 16. No clear solution to this issue arises, as it is a result of the SAR instrument. Instead, we suggest modeling the parameter distribution to provide an error estimate. Assuming a Gaussian distribution, the mean and variance of the offset would instead be estimated by the CNN. Eq. (4) would then minimize the negative log-likelihood of observing the measured offset. An end-user would then be aware of the variance associated with the estimation. Another way of improving the SOG would be to omit smaller ships. Yet, we aimed at an analysis for ships of all sizes in medium resolution Sentinel-1 images.
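The suggested Gaussian extension corresponds to the standard heteroscedastic negative log-likelihood. The sketch below parameterizes the variance via a predicted log-variance, which is a common but here assumed choice, not something specified in the text:

```python
import numpy as np

def gaussian_nll(mu, log_var, target):
    # Heteroscedastic Gaussian negative log-likelihood (constant term
    # dropped): the network would predict both the mean offset mu and
    # its log-variance, so uncertain ships are penalized less for
    # large errors but pay a log-variance cost.
    var = np.exp(log_var)
    return 0.5 * (log_var + (target - mu) ** 2 / var).mean()
```

Predicting log-variance rather than variance keeps the output unconstrained while guaranteeing positivity after the exponential, and the per-ship variance is exactly the error estimate an end-user could inspect.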

Table 1
MAE for Offset, COG and SOG.