Statistical modeling of Ship’s hydrodynamic performance indicator

Abstract The traditional method used to estimate the hydrodynamic performance of a ship uses either model test results or one of the many empirical methods to estimate and observe the trend in the fouling friction coefficient (ΔC F ) over time. The biggest weakness of this method is that the model test results, as well as the empirical methods, are sometimes not well-fitted to the full-scale ship for several reasons, such as scale effects, and the method may therefore result in an inaccurate performance prediction. Moreover, in the case of a novel ship design, it would be nearly impossible to find a well-fitting empirical method. The current work establishes a new performance indicator, formulated as a generalized admiralty coefficient with displacement and speed exponents statistically estimated using the in-service data recorded onboard the ship itself. The current method completely removes the dependence on empirical methods or model test results for the performance prediction of ships. It is observed that the current method and the traditional method are based on the same underlying logic, and the results obtained from the two methods are found to be in good agreement.


Introduction
The in-service data recorded onboard a ship can be instrumental in accurately estimating the operational performance of the ship, but it comes with an inherent problem. The operational performance of a system can be easily evaluated over time by comparing the observed operational value with a previously recorded value. It is of utmost importance that these two values belong to the same operational condition so that they can be considered comparable. This is very difficult to achieve in the case of a ship, as the recorded data is not only affected by the weather but is also spread over a wide range of the speed-displacement operational domain of the ship.
In a simple attempt to monitor the operational performance of a ship, Walker and Atkins (2007) proposed observing the increase in power demand of the ship at a fixed speed and displacement (or loading condition). This kind of practice is quite feasible for defence ships but is rather impractical for merchant ships due to, for instance, variation in displacement between individual runs. Another solution to this problem is to make an indirect comparison between the old and the new value using a benchmarking curve (or surface) which takes into account the variations due to speed and displacement. A conventional benchmark for a ship's operational performance is its calm-water speed-power curve. Using the operational data recorded onboard a ship, it is possible to regenerate this curve for a range of displacements, resulting in a speed-power-displacement surface which can then be used to monitor the performance of the ship, as we aim to demonstrate in this paper.
The aim of the current work is to establish a simple performance indicator which can be used to monitor the hydrodynamic performance of a ship using the in-service data recorded onboard it. The well-known admiralty coefficient is sometimes used as a hydrodynamic performance indicator for a ship. In view of that, the paper begins with an extensive literature review of the admiralty coefficient and the relationship between shaft power (P s ), speed-through-water (V) and displacement (Δ). Based on the review, a generalized form of the admiralty coefficient is proposed and fitted on the in-service data recorded onboard a ∼200 m ship over a duration of about 3 years. The in-service data used for fitting the model is corrected to remove the effect of environmental loads and marine fouling. The obtained generalized admiralty coefficient is then shown to be usable as a performance indicator for the ship. Finally, the predicted performance is validated by presenting a thorough comparison of the current method with the traditional performance prediction method for a ship, i.e., observing the trend in the fouling friction coefficient (ΔC F ).

Ship performance indicator
As mentioned above, the admiralty coefficient (Δ 2/3 V 3 /P s ) is sometimes used as a hydrodynamic performance indicator for a ship in service. This is because it is believed, by some, that the admiralty coefficient summarizes the relationship between the speed, power and displacement of a ship and provides a scalar value which can be conveniently compared to its future value. The basic assumption here is that the value of the admiralty coefficient remains constant for a ship over the whole operational domain, i.e., for all speed-displacement combinations, in calm-water conditions. This assumption has been contradicted with evidence by several researchers. The following section gives a complete overview of the admiralty coefficient, its history and the criticism that it has received in the marine research community.

Historical overview
In the earliest stages of research and development in the field of ship design, ship model experiments were used to explore the design space. The information obtained from these experiments was stored in a concise manner. This information storage system gradually led to the development of different data presentation systems as well as now well-known empirical relations, for example, the Froude number (named after William Froude, an English naval architect working at the Admiralty Experiment Works (AEW), England). Around 1878, B. J. Tideman, a naval engineer from the Netherlands, introduced the concept of non-dimensional presentation of ship model resistance data by presenting his model test results as resistance per displacement (R/Δ) plotted against speed per sixth root of displacement (V/Δ 1/6 ) (Telfer (1963)).
Almost 10 years later, in 1888, R. E. Froude, succeeding William Froude, published the so-called "Constant System of Notation" (Froude (1888)). The constant system attempted to standardize the ship model resistance data presentation system using some non-dimensional constants. R. E. Froude, probably from his knowledge and experience, here introduced the admiralty constant defined as Δ 2/3 V 3 /e.h.p., where e.h.p. is the effective horsepower. The idea was to plot the inverse of the admiralty constant (Ⓒ) against the non-dimensional ship length (Ⓜ) with discretely varying values of the non-dimensional ship speed (Ⓚ). Such iso-Ⓚ curves, also known as the ⒸⓀ presentation, were used by ship designers to obtain an optimal ship design.
The ⒸⓀ-presentation became quite popular, but it was not accepted by all. Telfer (1963) criticized this presentation as being "schizophrenic", arguing that it shows one thing and generally means exactly the opposite. He stated that inspection of any iso-Ⓚ sheet will invariably show that the reduction of Ⓒ requires an increase of Ⓜ, which is due to the fact that a model of constant length having a smaller and smaller displacement was being run at a lower and lower speed in relation to its length. Moreover, it was argued that the penalty for "stumpiness", i.e., a low Ⓜ value, was in fact false, as longer ships have a higher wetted-surface area and, therefore, increased frictional resistance. In order to fix this issue, Telfer (1963) proposed a new system of presentation, called the R c V c presentation (demonstrated by Doust and O'Brien (1959)), with R c = RL/ΔV 2 , where R is the ship's total resistance and L is the ship length. Telfer (1963) also presented an insight into the logical derivation of Ⓒ (which might or might not have been used by R. E. Froude) and R c . Here, it should be noted that the former seems to be based on the same non-dimensional presentation as proposed by B. J. Tideman, while the latter uses William Froude's non-dimensional speed instead, i.e., the Froude number. Additionally, Telfer (1963) stated that no merchant ship is ever designed to operate on a resistance varying as the square of the speed, or the power varying as the cube, i.e., the value of the exponent n, used as 2 in both presentations, can be taken as 3 for merchant ships. Unfortunately, no evidence supporting this was provided with the argument.
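A minimal sketch of how the two constants discussed above can be obtained, assuming a resistance law of the form R/Δ ∝ (speed parameter)^n with n = 2. This is a reconstruction from the definitions given in this section (Tideman's speed parameter V/Δ^{1/6} for the admiralty constant, the length-based speed parameter V/√L for R_c) and is not claimed to be Telfer's original working:

```latex
% Displacement-based speed parameter (Tideman), n = 2:
\frac{R}{\Delta} \propto \left(\frac{V}{\Delta^{1/6}}\right)^{2}
\;\Rightarrow\;
\frac{R}{\Delta^{2/3} V^{2}} = \frac{R\,V}{\Delta^{2/3} V^{3}}
\propto \frac{\text{e.h.p.}}{\Delta^{2/3} V^{3}} = \text{const.}

% Length-based speed parameter (Froude number), n = 2:
\frac{R}{\Delta} \propto \left(\frac{V}{\sqrt{L}}\right)^{2}
\;\Rightarrow\;
R_c = \frac{R\,L}{\Delta V^{2}} = \text{const.}
```

The first line yields the inverse of the admiralty constant, the second the resistance constant R_c; replacing n = 2 by another value changes the exponents of V accordingly.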

Admiralty coefficient: A performance indicator?
The admiralty constant (Δ 2/3 V 3 /e.h.p.) as well as the resistance constant (R c = RL/ΔV 2 ) (proposed by Telfer (1963)) were used to model the variation in hydrodynamic performance of different ship designs. Therefore, it is quite obvious that they used only the design point values of the included parameters but not the whole range of operational domain. In other words, the speed (V), displacement (Δ), and so on, were only actually the corresponding design point values. Thus, it was never intended to use these constants to monitor the operational performance of an individual ship but rather just compare the hydrodynamic performance of different ships in their respective design conditions. It is also noteworthy that the originally proposed admiralty constant (Δ 2/3 V 3 /e.h.p.) was a function of "effective" horsepower (e.h.p.) while the modern admiralty coefficient uses shaft power (P s ) instead (ITTC (2017)). Thus, the two, clearly, differ by the factor of propulsive efficiency of the ship, which is known to be varying for different operational conditions. Thus, it can be clearly concluded that the originally proposed admiralty constant or any of its variations were actually neither intended nor proven to be an appropriate operational hydrodynamic performance indicator for a ship. It was rather developed to compare the hydrodynamic performance of different ship designs. In any case, the idea of summarizing the calm-water speed-power curve into a singular or a very few constant values can still be realized using a simple statistical analysis of the operational data recorded onboard a ship, as demonstrated in the current work. In order to do so, a thorough literature survey is presented in the following sections to understand the relationship between shaft power (P s ), speed-through-water (V) and displacement (Δ).

Speed exponent (n)
The relationship between speed (V) and power (P s ) is widely accepted as P s ∝V n , with n = 3 according to the admiralty coefficient. From the physics point of view, the value of n = 3 is quite appropriate for the low speed range, where the total resistance coefficient remains constant (and is therefore independent of ship speed) due to negligible wave resistance. Kristensen (2010) used a computer model based on the updated Guldhammer and Harvald's method (Kristensen and Bingham (2017)) to estimate the value of n for container ships of different sizes and service speeds. He concluded that the cubic relationship is only valid for container ships in the low speed range, Froude number (Fn) ≲ 0.18; for the higher speed range, n can vary from 3 to 7.
In a very recent study, Taskar and Andersen (2019a) used a detailed model of ship performance to investigate fuel savings due to speed reduction for 6 hypothetical ships. The ship performance model, based on the updated Guldhammer and Harvald's method, was also used to study the speed exponent n. It was concluded that n is a function of ship size, type (or hull shape) and speed of operation. Based on that, Taskar and Andersen (2019a) presented n as a function of Froude number (Fn) for all the 6 ships. In addition to that, they calculated a constant averaged value of n by curve fitting the speed-power calm-water data assuming P s ∝V n . The value of n was observed to increase substantially (ranging between 3 and 6) above a certain Fn (depending on ship type and size), and the constant averaged values were found to be in the range of 3.3 to 4.2.
In a slightly different domain, several researchers used full-scale operational data from seagoing ships to calculate the speed exponent with respect to bunker consumption. This speed exponent is further used to estimate the bunker consumption of the ship during a sea voyage. Such an estimation forms the basis of several maritime transport models used for various purposes. The speed exponent with respect to bunker consumption will not be exactly the same as n, as the Specific Fuel Consumption (SFC or SFOC) of a marine engine varies with its load. But since this variation is very small, this speed exponent can be assumed to be equal to n (demonstrated by Taskar and Andersen (2019a)). Wang and Meng (2012) estimated the speed exponent with respect to bunker consumption for three types of container ships (3000, 5000 and 8000-TEU) using regression analysis of full-scale data from a global liner shipping company. The values of the speed exponent were obtained in the range of 2.7 to 3.3. Du et al. (2011), following the recommendations of the engine manufacturer MAN-Energy-Solutions (2004), used speed exponents of 3.5, 4.0 and 4.5 for feeder, medium-sized and jumbo container ships, respectively, to calculate the bunker consumption. Psaraftis and Kontovas (2013) reviewed 40 ship-speed-based models used in maritime transportation for various purposes like weather routing, scheduling, cost optimization, fuel management, and fleet deployment. Of these 40 models, 25 were established with a cubic speed exponent assumption.
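The constant averaged value of n mentioned above can be illustrated with a small log-log least-squares fit. The sketch below uses synthetic speed-power values generated with a known exponent; the function name and the numbers are illustrative assumptions, not data or code from the cited studies.

```python
import math

def fit_speed_exponent(speeds, powers):
    """Least-squares slope of ln(P) against ln(V), i.e. n in P ∝ V^n."""
    x = [math.log(v) for v in speeds]
    y = [math.log(p) for p in powers]
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    return sxy / sxx

# Synthetic calm-water curve generated with n = 3.5 (illustrative values only)
speeds = [8.0, 10.0, 12.0, 14.0, 16.0]
powers = [2.0 * v ** 3.5 for v in speeds]
n_hat = fit_speed_exponent(speeds, powers)
```

Because the synthetic curve is an exact power law, the fitted slope recovers the exponent used to generate it; with real data the slope is an average over the fitted speed range.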

Displacement exponent (m)
Unlike the speed exponent (n), not much attention has been paid to the displacement exponent (m = 2/3 according to the admiralty coefficient), which formulates the relationship between power (P s ) and displacement (Δ). The displacement exponent in the admiralty coefficient is very commonly used to correct sea trial data, but only within the limit that the difference between the trial displacement and the required displacement is less than 2% of the required displacement (ITTC (2017)). Thus, assuming m = 2/3 for the whole range of displacements for a ship does not seem reliable. Tu et al. (2018) derived a new admiralty coefficient to improve the reference speed estimation for EEDI calculations for container ships. The reference speed for EEDI calculations is generally obtained by using the speed-power-displacement relationship given by the admiralty coefficient. Tu et al. (2018) argued that the fixed displacement exponent (m) in the admiralty coefficient should be replaced by a function of the ship's hull form coefficients, namely, block coefficient (C b ), prismatic coefficient (C p ) and water-plane area coefficient (C w ). Tu et al. (2018) used the model test data of 4 container ships to calculate the new exponent using regression analysis and formulated m = 1 − 2(C b + C w )/3. The results were obtained by using the design point values only. Thus, the results may be useful for comparing different ship designs but do not seem to be valid for different draft values of the same ship.
From our physical understanding of ship hydrodynamics, it is important to realize that the displacement of the ship is not really a direct influencing parameter but rather it is being used to summarize several highly influential parameters, like the wetted-surface area and the water-plane area. The change in displacement produces a change in these influential parameters and therefore, results in the change of operational characteristics of the ship. This change significantly influences the speed-power calm-water curve, thereby stretching it in a third dimension which we are modelling using displacement. Now, fitting a constant exponent (m) over the whole range of this new dimension assumes that the trend along this dimension is continuous and follows the curve Δ m .
Merchant ship hull forms are nowadays optimized for best hydrodynamic performance in the design draft condition by introducing features like a transom stern and a bulbous bow. A change in displacement also produces a change in transom stern immersion and bulb immersion. The same is the case for propeller immersion, which is known to be a very influential parameter in ship hydrodynamics (Prpić-Oršić and Faltinsen (2012)). These factors would have an additional influence on the displacement exponent (m). Thus, the assumption of continuity and uniformity will probably not be valid over the whole range of displacement. Moreover, the fact that different combinations of draft and trim may result in the same displacement, and that variation in speed may also influence the value of m, introduces an additional complexity to the problem.
As in the case of the speed exponent (n), the above discussion indicates that the actual value of m over the whole range of displacement would, most likely, not be constant, and it may also vary with ship speed for the same displacement range. But it may still be possible either to model an averaged constant value of m over the whole domain or to obtain several values of m by piece-wise fitting the trend. The latter would likely produce better results, but the former would be more feasible to implement, keeping in mind that the available data is generally limited to only a handful of displacement values. In general, commercial ships like bulk-carriers and tankers operate either around the full-load or the ballast displacement for most of their voyages. Thus, it would be most advantageous to establish at least 2 values of m around these two displacement ranges.

Generalized admiralty coefficient: A performance indicator?
From the physics point of view, the generalized admiralty coefficient defines a log-linear relationship between speed-through-water (V), shaft power (P s ) and displacement (Δ) as follows:

$$P_s \propto \Delta^{m} V^{n} \tag{2}$$

Introducing a proportionality constant (p′) and taking the logarithm on both sides results in a linear equation as follows:

$$\ln P_s = p + m \ln \Delta + n \ln V, \qquad p = \ln p' \tag{3}$$

Fitting this relationship on the in-service calm-water data would result in the equation of a flat surface in log scale, representing the calm-water speed-power-displacement surface for the specific ship (consisting of speed-power calm-water curves at all the possible displacements). The exponents m and n can be statistically calculated using an ordinary least squares (OLS) regression. The in-service operational data, used to obtain the exponents, can be filtered for near-calm-water conditions, for instance, by limiting the wind speed and significant wave height below a certain critical value. It may be argued that even the remaining small variation due to the environmental loads in the filtered data may result in a bias in the estimates. An obvious solution to that would be to correct the measured shaft power to account for environmental loads using available physics-based (or empirical) methods. For the current work, both the near-calm-water filtered data and the filtered data with correction applied are used to obtain the results, in order to assess whether applying the environmental load corrections is really necessary.
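As a rough illustration of the log-linear fit described above, the following sketch estimates m and n by least squares on synthetic data. All numbers (exponents, intercept, displacement and speed ranges, noise level) are hypothetical and chosen only to make the example run; they are not the ship's data or the paper's results.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "calm-water" samples (hypothetical values, not the ship's data)
m_true, n_true, p_true = 0.6, 3.2, -5.0
disp = rng.uniform(20000.0, 40000.0, size=500)   # displacement
speed = rng.uniform(8.0, 16.0, size=500)         # speed-through-water
ln_power = (p_true + m_true * np.log(disp) + n_true * np.log(speed)
            + rng.normal(0.0, 0.02, size=500))   # small noise in log scale

# Design matrix for ln(P_s) = p + m ln(disp) + n ln(V)
X = np.column_stack([np.ones_like(disp), np.log(disp), np.log(speed)])
beta, *_ = np.linalg.lstsq(X, ln_power, rcond=None)
p_hat, m_hat, n_hat = beta

# Generalized admiralty coefficient evaluated at every sample
gac = disp ** m_hat * speed ** n_hat / np.exp(ln_power)
```

With an intercept in the model, the mean of ln(gac) over the fitted samples equals −p_hat, which mirrors the statement that the coefficient corresponds to the intercept of the fitted surface in log scale.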
The above method would provide an averaged constant value of m and n over the whole operational domain of the ship. Also, from Eq. 3, it can be confirmed that the numerical value of the generalized admiralty coefficient (Δ m V n /P s ) is nothing but the exponential of the negative intercept (e −p = 1/p′) of the fitted speed-power-displacement surface on the power axis (in log scale). Moreover, the value of the generalized admiralty coefficient (in log scale) obtained after substituting an operational point (Δ x , V x , P s,x ) is equal to the distance (along the power axis) of this operational point from a surface parallel and identical to the fitted calm-water reference surface, as shown below.

$$d_x = \frac{\left| m \ln \Delta_x + n \ln V_x - \ln P_{s,x} \right|}{\sqrt{m^{2} + n^{2} + 1}}$$

Here, d x is the shortest distance between the point (Δ x , V x , P s,x ) and the parallel surface, which passes through the origin in log scale and is identical (in shape and orientation) to the fitted calm-water reference surface. Now, it is well known that the line representing the shortest distance between a point and a surface is perpendicular to the surface. Thus, the cosine of the angle (θ) between the line representing the shortest distance and the power axis (in log scale) is given by:

$$\cos\theta = \pm\frac{1}{\sqrt{m^{2} + n^{2} + 1}}$$

Using simple geometry (shown in Fig. 1), the distance of (Δ x , V x , P s,x ) from the reference parallel surface (passing through the origin) along the power axis is given by $d_x/\cos\theta = \pm\sqrt{m^{2} + n^{2} + 1}\cdot d_x$, which equals $\ln(\Delta_x^{m} V_x^{n}/P_{s,x})$. Thus, the idea of using the generalized admiralty coefficient as a hydrodynamic performance indicator is mathematically equivalent to calculating and comparing the distance (along the power axis) of the operational points from the reference calm-water surface in the speed-power-displacement domain.
On a different note, the discussion presented in the previous sections indicates that the values of these exponents may vary over the speed-displacement operational domain, i.e., the actual speed-power-displacement calm-water surface is probably log-non-linear. In such a case, the linear regression model presented above may be used to piece-wise fit the available data and obtain several values of these exponents. This would be equivalent to fitting the log-non-linear reference calm-water surface by several patches of log-linear surfaces in order to account for non-linearities. The results obtained using such an approach are also presented in the current work. Nevertheless, the ratio V n Δ m /P s , further referred to as the generalized admiralty coefficient, with appropriate values of m and n, can be used as an operational performance indicator for the ship which can be easily monitored over time.
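The geometric argument above can be checked numerically. The sketch below uses assumed exponents and a hypothetical operating point, computes the signed perpendicular distance to the plane through the origin in log scale, and converts it to a distance along the power axis.

```python
import math

m, n = 0.6, 3.2                                   # illustrative exponents (assumed)
disp_x, speed_x, power_x = 30000.0, 12.0, 4000.0  # hypothetical operating point

# Plane through the origin in log scale: m*ln(disp) + n*ln(V) - ln(P) = 0
signed = m * math.log(disp_x) + n * math.log(speed_x) - math.log(power_x)
norm = math.sqrt(m ** 2 + n ** 2 + 1.0)

d_x = signed / norm                 # signed perpendicular distance to the plane
cos_theta = 1.0 / norm              # angle between the plane normal and the power axis
along_power_axis = d_x / cos_theta  # distance measured along ln(P_s)

# Logarithm of the generalized admiralty coefficient at the same point
ln_gac = math.log(disp_x ** m * speed_x ** n / power_x)
```

The two quantities `along_power_axis` and `ln_gac` agree to floating-point precision, which is the equivalence the paragraph describes.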

Fouling friction coefficient (ΔC F )
The traditional method to evaluate the performance of a ship observes the trend in the fouling friction coefficient (ΔC F ) over time. The fouling friction coefficient is calculated as the difference between the total resistance coefficient (C T,Data ) obtained from the in-service data in calm-water conditions and the total resistance coefficient (C T,Emp ) obtained from a well-established empirical method or model test results:

$$\Delta C_F = C_{T,Data} - C_{T,Emp}$$
Multiplying the above equation with the non-dimensionalizing factor (1/2ρSV 2 ) and again with the ship speed (V) results in an equation in terms of effective power. Further dividing the resulting equation with propulsion efficiencies would result in the same equation in terms of shaft power (P s ).
Now, the above equation is clearly the distance (along the power axis) between the observed operational point (Δ, V, P s ) and the reference calm-water speed-power-displacement surface, determined by the adopted empirical method.
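The steps described above can be written out as a short derivation. This is a sketch under the assumption that a single total propulsive efficiency, denoted here by the hypothetical symbol η, applies to both the data-based and the empirical term:

```latex
\Delta C_F = C_{T,Data} - C_{T,Emp}

% multiply by (1/2) \rho S V^2 and by V: effective power
\Delta C_F \cdot \tfrac{1}{2}\rho S V^{3} = P_{E,Data} - P_{E,Emp}

% divide by the propulsive efficiency (assumed equal for both terms): shaft power
\frac{\Delta C_F \cdot \tfrac{1}{2}\rho S V^{3}}{\eta} = P_{s,Data} - P_{s,Emp}
```

The right-hand side of the last line is the difference in shaft power between the observed point and the empirical reference at the same speed and displacement, i.e., a distance along the power axis.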
From the above and the discussion in the previous section, it is clear that the method proposed in the current work (using the generalized admiralty coefficient) and the traditional method are based on the same underlying logic, i.e., observing the distance between the operational point and the reference calm-water speed-power-displacement surface. Finally, it should be noted that the results from the traditional method depend on how well the adopted empirical method's reference surface mimics the actual calm-water speed-power-displacement surface for the given ship. So it is critically important to validate the adopted empirical method for the given ship using the in-service data before using it for performance predictions. The new performance indicator, introduced in the current work, does not have this problem, as its reference surface is fitted to the ship's own in-service data.

OLS regression
A linear model, defining the relationship between a response variable and a group of independent variables, can be written in the form:

$$Y = X\beta + \epsilon$$

where Y is the response variable, β is the vector of coefficients, X is the set of p independent variables (including the intercept), and ϵ contains the Normally distributed zero-mean residuals, i.e., ϵ ∼ N(0, σ 2 ), where σ 2 is the true residual or error variance.
The coefficients, β, can be estimated using least squares regression as follows:

$$\hat{\beta} = (X^{T}X)^{-1}X^{T}Y$$

In an ordinary least squares (OLS) regression, the above parameter estimates ($\hat{\beta}$) are obtained by minimizing the sum of squared residuals (SSR):

$$SSR = (Y - X\hat{\beta})^{T}(Y - X\hat{\beta})$$

The estimated parameters are statistics and, therefore, they have their corresponding sampling distributions. If the model assumptions are correct, these sampling distributions are also Normal, and the estimated parameter values are the means of these sampling distributions. The variances of these sampling distributions can be calculated using the true error variance (σ 2 ) and the regressors (X) as follows:

$$\mathrm{Var}(\hat{\beta}) = \sigma^{2}(X^{T}X)^{-1}$$

Fig. 1. Showing the angle (θ) subtended between the perpendicular dropped on the reference surface from the point (Δ x , V x , P s,x ) and the line marking the distance of the point (Δ x , V x , P s,x ) from the reference surface along the shaft power axis (in log scale). The figure shows the 2D projection of a 3D space, assuming that the third axis (ln Δ) is protruding out and above the 2D plane.

Since the true error variance (σ 2 ) is not known, it can be approximated by its best estimate (s 2 ) obtained using n samples, and it can be further used to obtain the standard errors (SE) (or standard deviations) of the estimated parameters as follows:

$$s^{2} = \frac{SSR}{n - p}, \qquad SE(\hat{\beta}_j) = \sqrt{s^{2}\left[(X^{T}X)^{-1}\right]_{jj}}$$
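The estimates above translate directly into a few lines of linear algebra. The following is a minimal sketch using textbook OLS formulas on synthetic data; the helper name `ols_fit` and the example values are assumptions made for illustration.

```python
import numpy as np

def ols_fit(X, y):
    """Return OLS coefficients, residual variance estimate and standard errors."""
    n_samples, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y              # (X'X)^-1 X'y
    residuals = y - X @ beta
    ssr = float(residuals @ residuals)    # sum of squared residuals
    s2 = ssr / (n_samples - p)            # estimate of the error variance
    se = np.sqrt(s2 * np.diag(xtx_inv))   # standard errors of the coefficients
    return beta, s2, se

# Synthetic example: y = 2 + 3*x1 + noise with standard deviation 0.1
rng = np.random.default_rng(1)
x1 = rng.uniform(0.0, 1.0, size=200)
X = np.column_stack([np.ones_like(x1), x1])
y = 2.0 + 3.0 * x1 + rng.normal(0.0, 0.1, size=200)
beta, s2, se = ols_fit(X, y)
```

Explicitly inverting X'X keeps the code close to the formulas; for ill-conditioned regressors a solver such as `np.linalg.lstsq` is usually the more robust choice.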

Quasi-steady filter
The current method is only applicable for data samples obtained in a quasi-steady state. In other words, the acceleration of the ship at each time step must be negligible. To ensure this, a two-stage filter is implemented to remove the samples with non-zero acceleration (further referred to as unsteady samples). The first stage of the filter uses a sliding window to remove unsteady samples as proposed by Dalheim and Steen (2020), while the second stage enforces an additional gradient check for the samples failing after the first stage.
In the first stage, a sliding window is used to observe the slope of a straight line fitted using linear regression. Further, a Student's t-test is done to check for a non-zero slope (representing unsteady behavior), and a pass(1)-fail(0) test-statistic is calculated for each window. Each sample is then assigned a front and a rear test-statistic, which are obtained as the test-statistic calculated for the window when the given sample was at the front and rear end of the sliding window, respectively. It should be noted that the front end of a window is in the direction of the motion of the window, as explained in Fig. 1 in Dalheim and Steen (2020). Finally, each leg of unsteady behavior is identified as a leg starting with a rear test-statistic failure and ending with a front test-statistic failure.
The second stage filter calculates the backward gradient or slope only at the samples failing in the first stage and performs a Student's t-test, like the first stage, to check for a non-zero slope. The samples indicating a non-zero slope are finally removed as non-quasi-steady samples.
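The core of the first stage, a slope test on each sliding window, can be sketched as follows. This is a simplified illustration and not the full two-stage filter: the window length, the fixed critical value `t_crit`, and the function names are assumptions, and the front/rear bookkeeping of Dalheim and Steen (2020) is omitted.

```python
import math

def slope_t_statistic(values):
    """t-statistic for the null hypothesis 'slope = 0' of a straight-line fit."""
    k = len(values)
    x_mean = (k - 1) / 2.0
    y_mean = sum(values) / k
    sxx = sum((i - x_mean) ** 2 for i in range(k))
    sxy = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    ssr = sum((v - (intercept + slope * i)) ** 2 for i, v in enumerate(values))
    se = math.sqrt(ssr / (k - 2) / sxx)   # standard error of the slope
    if se == 0.0:
        return 0.0 if slope == 0.0 else math.inf
    return slope / se

def window_is_steady(series, window=8, t_crit=2.45):
    """Pass (True) / fail (False) flag for each sliding window.

    t_crit = 2.45 approximates the two-sided 5% Student's t value for
    window - 2 = 6 degrees of freedom (an assumed setting).
    """
    return [abs(slope_t_statistic(series[i:i + window])) <= t_crit
            for i in range(len(series) - window + 1)]

# Hypothetical shaft rpm record: steady, ramp-up, steady again
rpm = [80.0] * 10 + [80.0 + 2.0 * i for i in range(1, 11)] + [100.0] * 10
flags = window_is_steady(rpm)
```

Windows lying fully on a constant segment pass, while windows lying on the ramp fail; mapping window flags back to individual samples is where the front and rear test-statistics described above come in.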

Data
The current work is based on the extended dataset obtained from the same sources as Gupta et al. (2019). The complete dataset is an assimilation of in-service measurement data recorded onboard a ship and weather hindcast data.

Ship data
The data is recorded onboard a ∼200 m long general cargo ship with an installed capacity of ∼10 MW (MCR), equipped with the Marorka Online web application. The data used here is recorded over a duration of about 3 years covering several voyages around the globe (shown in Fig. 2), and it contains uniformly sampled 15-minute mean values for each recorded variable. The recorded variables are further used to calculate some additional variables which are more appropriate for the current analysis, for instance, mean draft, trim-by-aft, displacement, etc. The recorded data is filtered to extract the samples recorded during a sea voyage. See Gupta et al. (2019) for a further overview of the recorded data variables as well as a brief description of the first part of the preprocessing.

Hindcast data
The weather hindcast data, for wind and waves, is obtained from the European Centre for Medium-Range Weather Forecasts (ECMWF) (Copernicus Climate Change Service (C3S) (2017)). The ECMWF data is obtained from the ERA5 HRES (High Resolution) climate reanalysis dataset. The spatial resolution of ERA5 HRES, used here, is 0.25° and the temporal resolution is 1 hour. The weather data variables (presented in Table 2) are linearly interpolated in space and time to the ship's location using the available navigation data. Fig. 3 shows the distribution of total wind speed and significant wave height encountered by the ship during the data recording duration. Fig. 4 presents the speed-through-water (or log speed) vs shaft power from the raw data recorded over a period of 3 years onboard the ship. The figure shows a good spread over a speed range of 6 to 16 knots. The design speed of the ship is 15.5 knots. In comparison to a typical calm-water curve obtained from model tests or numerical simulations, the raw data in Fig. 4 shows a large variation in power for a fixed speed. This is expected to occur due to variation in loading conditions and environmental loads. Nevertheless, this does not explain the samples with quite high shaft power at almost zero speed-through-water. A closer analysis reveals that such samples are obtained due to non-zero accelerations, i.e., periods when the ship is accelerating, for example, due to a voluntary increase in shaft rpm by the ship master.

Data exploration & pre-processing
Quasi-steady filter: Although it may be possible to use the samples with non-zero acceleration (further referred to as unsteady samples) after correcting for the effect of acceleration of the ship, it is decided to remove these samples for the current work. After removing all such samples, the ship can be assumed to be in a quasi-steady state at each observed sample. Unfortunately, the speed-through-water (or log speed) measurements cannot be used as a means to remove these samples, for several reasons. The speed-through-water measurements are quite noisy due to inadequate sensor accuracy, and the ship speed would also contain accelerations and decelerations due to changing environmental loads, which must be retained in the filtered data. But since the data here is averaged over the last 15 minutes, and 15 minutes seems to be long enough for the ship speed to catch up with an rpm change command, shaft rpm measurements can be used to remove unsteady samples. Thus, the quasi-steady filter presented previously is applied to the shaft rpm time series to filter out the unsteady samples. Fig. 5 shows a small section of the shaft rpm time series with the quasi-steady filter in action. The figure contains two legs of unsteady behavior, one just before sample 20360 and the other around sample 20380. As observed in the figure, the first stage filter also tends to remove some steady samples at the beginning and end of unsteady legs. The second stage filter helps retain these samples. The right-hand side subplot in Fig. 4 shows all the samples remaining after applying the quasi-steady filter on the raw data. Comparing the raw and the filtered data in Fig. 4, most of the points with small speed but very high shaft power are removed.
Ship heading estimation: The direction of heading of the ship is required to estimate the environmental loads acting on the ship. Although the gyro and COG headings are recorded onboard the ship (as shown in Table 1), it is observed that there were some errors in these measurements. The recorded heading variables were filled with zeros in the latter part of the time series. The ship heading is, therefore, estimated using the latitude and longitude variables recorded onboard the ship. The estimated ship heading is further validated against the first (non-zero) part of the recorded COG heading time series.
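One common way to estimate a direction from consecutive positions is the initial great-circle bearing between two fixes. The sketch below assumes this formula; the exact procedure used for the dataset is not specified above, and a bearing computed from positions is strictly a course over ground, which differs from the true heading when the ship drifts.

```python
import math

def initial_bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0

# Hypothetical consecutive position fixes (latitude, longitude in degrees)
north = initial_bearing_deg(0.0, 0.0, 1.0, 0.0)
east = initial_bearing_deg(0.0, 0.0, 0.0, 1.0)
south = initial_bearing_deg(1.0, 0.0, 0.0, 0.0)
```

Applied to each pair of consecutive samples in the latitude and longitude time series, this gives one bearing per interval, which can then be compared against the valid part of a recorded heading signal.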
Draft correction: In general, draft measuring sensors are calibrated to convert measured pressure to water column height, resulting in draft measurements. However, due to the Venturi effect (or non-zero dynamic pressure), when the relative velocity between the ship and the fluid is non-zero, the measured pressure is smaller than the actual hydrostatic pressure and, thus, the measured draft is smaller than the actual draft. Therefore, the draft measurements are corrected to account for this effect by interpolating the draft during an individual trip using the initial and final draft measurements (when the ship speed is negligible).
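A minimal sketch of such a correction is given below, assuming that the draft varies linearly in time between the two reliable end-point measurements. The linear form, the function name and the example values are assumptions for illustration.

```python
def interpolate_trip_draft(times, draft_start, draft_end):
    """Linearly interpolate draft between the first and last time stamp of a trip."""
    t0, t1 = times[0], times[-1]
    if t1 == t0:
        return [draft_start for _ in times]
    return [draft_start + (draft_end - draft_start) * (t - t0) / (t1 - t0)
            for t in times]

# Hypothetical trip: time stamps in minutes, drafts in metres measured at near-zero speed
times = [0, 15, 30, 45, 60]
drafts = interpolate_trip_draft(times, 10.0, 9.6)
```

The underway measurements are replaced entirely by the interpolated values, so only the measurements taken at negligible ship speed influence the result.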
Displacement estimation: Mean draft or draft at mid-ship and trim-by-aft are obtained as the mean and difference, respectively, of the above corrected aft and fore draft (assuming that the measured draft aft and draft fore are drafts at aft peak and fore peak, respectively). The mean draft and trim-by-aft are used as the input parameters to linearly interpolate the displacement for each data sample using the hydrostatics obtained from the 3D model of the ship. It is worth remembering that using the 3D model to estimate the displacement of the ship is also an approximation due to inherent discrepancies between the model and the real ship. As an estimation of error, it was observed that the displacement obtained using the 3D model, without the appendages, was about 73 tonnes more than the value reported in the sea trial report of the vessel for the same draft and trim settings. In view of this, no further corrections were made to account for an increase in displacement due to the appendages.

RESULTS
The results are divided in the following 4 sections. The first section presents the fouling friction coefficient (ΔC F ) calculated using the traditional method, which is further used for correcting the data for performance variation in time due to marine fouling. The next section presents the averaged constant value of displacement (m) and speed (n) exponents estimated using the in-service data recorded onboard a seagoing ship. The third section does the same but, here, the model is fitted piece-wise over the speed-displacement domain after dividing it into a regular grid, thus, presenting a grid of statistically fitted values for m and n. The final section presents a comparison of the obtained performance indicator, i.e., the generalized admiralty coefficient, with the most widely accepted performance indicator, the fouling friction coefficient (ΔC F ), as well as a demonstration regarding the use of the obtained performance indicator as a tool to monitor the hydrodynamic performance of a ship.

Calm-water in-service data
The calm-water in-service data is obtained by further filtering the steady-filtered data (shown in Fig. 4) for near-calm-water limits: total wind speed (|V Wind |) less than 5.5 m/s (equivalent to Beaufort scale 3) and significant wave height (H S ) less than 1 m. As mentioned before, the data used here is recorded over a duration of about 3 years. The ship's propeller was cleaned 6 times in this duration. Based on these propeller cleaning events, the filtered near-calm-water data is divided into 7 legs with a propeller cleaning event falling between two consecutive legs. Fig. 6 shows the filtered near-calm-water in-service data in a log speed (or speed-through-water) vs shaft power space for all the legs. Fig. 7 shows the distribution of the filtered data in different legs as well as the distribution of data in all the legs combined (leg All).
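The near-calm-water filtering and the division into legs can be sketched as follows. The limits (5.5 m/s and 1 m) are those stated above; the function and argument names are hypothetical, and the time stamps are assumed to be in any consistent numeric unit.

```python
import numpy as np

def calm_water_mask(wind_speed_ms, hs_m, wind_limit=5.5, hs_limit=1.0):
    """Boolean mask of samples within the near-calm-water limits."""
    wind = np.abs(np.asarray(wind_speed_ms, dtype=float))
    hs = np.asarray(hs_m, dtype=float)
    return (wind < wind_limit) & (hs < hs_limit)

def assign_legs(sample_time, cleaning_times):
    """Leg number (1-based) of each sample.

    Samples before the first propeller cleaning event belong to leg 1;
    k cleaning events therefore give k + 1 legs.
    """
    events = np.sort(np.asarray(cleaning_times, dtype=float))
    return np.searchsorted(events, np.asarray(sample_time, dtype=float), side="right") + 1
```

With 6 cleaning events, `assign_legs` returns values from 1 to 7, matching the 7 legs used here.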

Environmental load corrections
The calm-water in-service data presented in the previous section is corrected, in some cases, for wind and wave loads using empirical methods. Fujiwara's method (Fujiwara et al. (2005)) is used for wind load corrections (as recommended by ITTC (2017)), and DTU's method (Martinsen (2016), Taskar and Andersen (2021)), based on an approach which uses the strip theory (Salvesen (1978)) and the asymptotic limit (Faltinsen et al. (1981)) to obtain the added wave resistance transfer functions, is used for wave load corrections. DTU's method for wave load corrections is used here with the help of ship simulation workbench (Taskar and Andersen (2019b)), and it provides added wave resistance corrections for the relative mean wave heading from head (180°) to beam seas (90°).

Fig. 6. Filtered near-calm-water in-service data. The time series data is divided into 7 legs with a propeller cleaning event falling between two consecutive legs.
The total propulsive efficiency (η D ), for the given ship, is interpolated using the data available from the model test results for the ship. A linear interpolation grid is created over the speed vs mean draft domain using the model test data for interpolating η D for each data sample. For samples outside the interpolation grid (for example, the samples with smaller ship speed which are outside the model test range), the nearest value on the grid is used.
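A minimal sketch of this interpolation is given below. It assumes a rectangular model-test grid with strictly increasing axes and implements "nearest value on the grid" by clamping each query coordinate to the grid limits before bilinear interpolation. All names and the example numbers in the usage note are hypothetical.

```python
import numpy as np

def interp_eta_d(speed, draft, speed_grid, draft_grid, eta_grid):
    """Bilinear interpolation of eta_D over a (speed, mean draft) grid.

    eta_grid[i, j] is the value at speed_grid[i], draft_grid[j].
    Queries outside the grid are clamped to the grid limits, so they
    receive the value at the nearest grid boundary point.
    """
    speed_grid = np.asarray(speed_grid, dtype=float)
    draft_grid = np.asarray(draft_grid, dtype=float)
    eta_grid = np.asarray(eta_grid, dtype=float)
    s = np.clip(np.asarray(speed, dtype=float), speed_grid[0], speed_grid[-1])
    d = np.clip(np.asarray(draft, dtype=float), draft_grid[0], draft_grid[-1])
    # Index of the lower-left corner of the cell containing each query
    i = np.clip(np.searchsorted(speed_grid, s) - 1, 0, len(speed_grid) - 2)
    j = np.clip(np.searchsorted(draft_grid, d) - 1, 0, len(draft_grid) - 2)
    ts = (s - speed_grid[i]) / (speed_grid[i + 1] - speed_grid[i])
    td = (d - draft_grid[j]) / (draft_grid[j + 1] - draft_grid[j])
    return ((1 - ts) * (1 - td) * eta_grid[i, j]
            + ts * (1 - td) * eta_grid[i + 1, j]
            + (1 - ts) * td * eta_grid[i, j + 1]
            + ts * td * eta_grid[i + 1, j + 1])
```

For example, with a grid spanning 10 to 20 knots, a sample at 5 knots would be assigned the efficiency interpolated along the 10-knot edge of the grid.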

Fouling friction coefficient (ΔC F )
The data used for the current work is recorded over a duration of about 3 years and consists of numerous voyages. The ship usually remains static for some time between voyages, causing a build-up of marine fouling on the hull and propeller (Malone et al. (1981)). Moreover, the given data is affected by several propeller cleaning events, and a propeller cleaning activity may considerably influence the performance of a ship (Townsin (1982)). Thus, to obtain a good estimate of displacement (m) and speed (n) exponents, the data should be corrected to account for performance variation due to such phenomena.
The shaft power measurement data is corrected for variation in performance over time due to marine fouling. These corrections are calculated by observing the trend in fouling friction coefficient (ΔC F ) with respect to the cumulative ship static time. The fouling friction coefficient (ΔC F ) is calculated using Eq. 6 with C T,Data calculated using the near-calm-water in-service data (presented in Section 6.1) with environmental loads corrections, using the method described in Section 6.2. The total calm-water resistance coefficient (C T,Emp ) is calculated using updated Guldhammer and Harvald's method (Kristensen and Bingham (2017)), as it is found to be fitting well for the given ship. The cumulative ship static time is calculated as the cumulative time (in seconds) for which the ship speed remains less than 3 knots (as suggested by Malone et al. (1981)). The fitted trend lines are used to calculate shaft power corrections to remove the effect of fouling, as shown in Fig. 8.
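The cumulative ship static time can be computed with a few lines, as sketched below. The 3-knot limit is the one stated above; attributing each sampling interval to the speed recorded at its end is an assumption of this sketch, and the function name is hypothetical.

```python
import numpy as np

def cumulative_static_time(time_s, speed_kn, static_limit_kn=3.0):
    """Cumulative time (s) for which the ship speed stays below the limit.

    Assumption: the interval between sample i-1 and sample i is counted
    as static when the speed recorded at sample i is below the limit.
    """
    time_s = np.asarray(time_s, dtype=float)
    speed_kn = np.asarray(speed_kn, dtype=float)
    dt = np.diff(time_s, prepend=time_s[0])  # first interval has zero length
    static_dt = np.where(speed_kn < static_limit_kn, dt, 0.0)
    return np.cumsum(static_dt)
```

The returned series is non-decreasing and serves as the horizontal axis against which the trend in ΔC F is observed.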
The calculated ΔC F values show quite small variation with time and most of the values are in the negative range. The small variation indicates little fouling build-up. This can be attributed to the fact that the data used here is obtained from a newly-built ship (from the first 3 years of service) and the anti-fouling systems are quite effective. The obtained ΔC F values are negative most likely because the method used here to calculate C T,Emp overestimates the calm-water resistance for the given ship.

Simple regression
The current section presents the averaged constant value of exponents estimated using the filtered near-calm-water in-service data (presented in Section 6.1) recorded onboard a ship over a duration of about 3 years. The following two models are used to calculate the displacement (m) and speed (n) exponents:
• OLS: An ordinary least squares model, based on Eq. 3, fitted to near-calm-water data, presented in Section 6.1.
• OLS (Corr.): An ordinary least squares model fitted to near-calm-water data corrected for environmental loads, as explained in Section 6.2.

Fig. 7. Filtered near-calm-water in-service data (same as in Fig. 6) as violin plot. The thick black vertical lines stretch between 25% and 75% quantiles.

The exponents calculated after applying corrections for variation in performance over time due to marine fouling are also presented in this section. The fouling corrections are done by subtracting the expected increase in the shaft power due to non-zero ΔC F from the measured shaft power, as explained in Section 6.3. As a simple cross-validation test, the exponents are also presented for the filtered but uncorrected (for fouling) legs. Finally, to draw a comparison, three well-known empirical methods are used to calculate the averaged constant value of exponents for the given ship. The data from empirical methods is obtained with the help of ship simulation workbench (Taskar and Andersen (2019b)).

Fig. 9. Averaged constant exponents calculated using OLS and OLS (Corr.) models. Shaded regions represent 95% confidence intervals. Refer Table 3 for values.

Discussion: Table 3 presents the results obtained from the OLS and OLS (Corr.) models for all the legs, the complete dataset (leg All) and the time corrected dataset (leg All-T). Fig. 9 shows the estimated exponents in graphical format along with 95% confidence intervals. Fig. 10 shows the fitted speed-power curves for the simple OLS model using the data in leg 2.
It is observed that the averaged constant displacement (m) and speed (n) exponents (shown in Table 3 and Fig. 9) obtained from the complete dataset (leg All) and the time-corrected dataset (leg All-T) are not very different. This is not surprising as the observed trends for individual legs in the fouling friction coefficient (shown in Fig. 8) are quite small, which in turn is probably due to the fact that the current data is recorded onboard a newly-built ship. It should also be noted that applying environmental load corrections on top of the currently employed near-calm-water filtering limits (|V Wind | < 5.5 m/s & H S < 1 m), as done in the case of the OLS (Corr.) model, may not be necessary, as the simple OLS model produces estimates for m and n that are close to those of the OLS (Corr.) model.
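The estimation of the averaged constant exponents can be illustrated with a short sketch. It assumes that Eq. 3 has the log-linear form ln P = β0 + m ln Δ + n ln V, which follows from taking the logarithm of a generalized admiralty coefficient of the form Δ^m V^n / P; the exact form of Eq. 3 should be checked against the original. The function name is hypothetical, and the standard errors use the usual OLS formula.

```python
import numpy as np

def fit_exponents(displacement, speed, power):
    """OLS fit of ln(P) = b0 + m*ln(displacement) + n*ln(speed).

    Returns (beta, se) with beta = [b0, m, n] and se the standard errors.
    Assumption: this log-linear form stands in for Eq. 3 of the paper.
    """
    displacement = np.asarray(displacement, dtype=float)
    speed = np.asarray(speed, dtype=float)
    power = np.asarray(power, dtype=float)
    X = np.column_stack([np.ones_like(speed), np.log(displacement), np.log(speed)])
    y = np.log(power)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof                  # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)         # parameter covariance
    return beta, np.sqrt(np.diag(cov))
```

An approximate 95% confidence interval for each exponent is then `beta ± 1.96 * se`, assuming normally distributed residuals in log scale.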
The speed exponent (n) obtained for the first and last legs (i.e., legs 1 & 7) using both the models is quite small as compared to the values in other legs. This is due to the fact that these legs consist of very few samples spread over a very limited range of the speed axis (refer Figs. 6 and 7). Leg 5 is also observed to have very few samples but in this case the samples are well distributed along the speed axis. Looking at Fig. 7, it can also be said that leg 6, resulting in n ≈ 3.1, has the best coverage over the speed-power domain. This is very close to the value obtained for the complete dataset (legs All & All-T) as well as the speed exponent in the admiralty coefficient. Further, discarding the results in legs 1 & 7, it is observed that the averaged constant speed exponent (n) lies between 2.8 and 3.5 for the given ship, and 3.1 is the mean as well as the most probable value of n. The fitted displacement exponent (m) seems to vary quite substantially between legs. Thus, it does not seem appropriate to define a reliable range for the true value of m. Observing Fig. 7, it can be seen that the samples in legs 2, 3 & 4 have a fairly good coverage over the displacement-power domain but each of them results in a very different value. It should be noted that leg 3, which shows the best coverage over the displacement-power domain, predicts the displacement exponent to be ∼ 0.6 (closest to 2/3, the exponent in the admiralty coefficient) but the complete datasets (legs All & All-T) result in m ≈ 0.5.

Table 3. Results obtained from OLS and OLS (Corr.) regression models. The full dataset is divided into 7 legs with a propeller cleaning event falling between two consecutive legs. Leg All contains the full dataset (obtained after merging all the legs). Leg All-T also contains the full dataset but it is corrected for performance variation over time due to marine fouling.
As aforementioned, the averaged constant displacement (m) and speed (n) exponents are also calculated using the data obtained from three well-known empirical methods for the given ship (shown in Table 4). The exponents obtained here seem to be on the higher side as compared to the corresponding values from in-service data, and the displacement exponent in all the three cases is quite close to m = 2/3, as in the case of the original admiralty coefficient. In the case of empirical methods, a weighted least squares (WLS) regression model (refer Appendix A) has to be used to calculate the exponents, as explained in Appendix B.

Piece-wise regression
It may be expected that the log-linear assumption, taken in Eq. 3, would result in an inaccurate modeling of the speed-power-displacement reference surface. The extensive literature survey presented in the current work also suggests that the speed (n) and displacement (m) exponents are not constant over the complete speed-displacement domain, thus indicating a log-non-linear relationship between speed, power and displacement. Nevertheless, since any smooth non-linear surface can be approximated by piece-wise linear patches, it is possible to fit a log-linear relationship more closely to the given data by dividing it into several small pieces over the speed-displacement domain. This is equivalent to modeling the log-non-linear surface in speed-power-displacement 3D space by several patches of log-linear surfaces. Fig. 11 presents the displacement (m) and speed (n) exponents calculated using the calm-water data (shown by faded hollow circles) obtained, for the given ship, from updated Guldhammer and Harvald's method by piece-wise fitting the log-linear model. The data is, first, divided into rectangular blocks, shown by grid-like lines in Fig. 11, and, then, the exponents are calculated for each block by using simple OLS regression in log scale. Finally, the calculated values are used to create the m and n contours over the whole speed-displacement domain. The goodness of fit is indicated (in Fig. 11 title) by the minimum R-squared (R²) and maximum root mean square error (RMSE) obtained from the blocks in the fitted domain. The results confirm the expected log-non-linearity, i.e., non-constant and varying values for displacement (m) and speed (n) exponents over the speed-displacement domain. Observing the values in Fig. 11, it can be noted that the major area of the contours corresponds to the values of the exponents in the original admiralty coefficient.

Table 4. Averaged constant displacement (m) and speed (n) exponents obtained from the conventional empirical methods for the given ship.

Fig. 11. Displacement and speed exponents (m and n, respectively) calculated piece-wise using the calm-water data obtained using updated Guldhammer and Harvald's method (Kristensen and Bingham (2017)) for the given ship. The data used for fitting the model is shown by faded hollow circles. The grid-like lines divide the data into pieces or blocks which are further used to carry out log-linear OLS regression. The exponents obtained for each block are further used to obtain the contours.

Obtaining a similar contour from the in-service calm-water data is far more complicated due to reasons such as sparsity and non-uniform distribution of the data. Fig. 12 shows the displacement (m) and speed (n) exponents obtained from in-service data. The data used here is corrected for both time and environmental loads (same as OLS (Corr.) model in leg All-T in the previous section). Although some of the obtained values are comparable with the values in Fig. 11, the values are still not consistent enough to create a contour. Moreover, it should be noted that the grid blocks used in Fig. 12 are substantially bigger than the blocks used in Fig. 11. Smaller blocks result in many more inappropriate values of m and n as the variation due to noise in the data becomes larger than the variation due to the actual trend.
The inconsistencies in the values of exponents observed in Fig. 12 indicate that the fitted patches of log-linear surfaces will not produce a very smooth speed-power-displacement surface for the ship. In view of that, the averaged constant exponents, obtained for leg All-T in the previous section, are used to formulate the performance indicator for the current case.
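The piece-wise fitting procedure described in this section can be sketched as follows. The sketch assumes the same log-linear form ln P = β0 + m ln Δ + n ln V within each rectangular block; the block edges, the `min_samples` threshold and the function name are hypothetical choices, not values from the original work.

```python
import numpy as np

def piecewise_exponents(displacement, speed, power,
                        disp_edges, speed_edges, min_samples=10):
    """Fit ln(P) = b0 + m*ln(disp) + n*ln(speed) separately in each block
    of a rectangular displacement-speed grid.

    Returns (m_grid, n_grid); blocks with fewer than `min_samples`
    samples (hypothetical threshold) are left as NaN.
    """
    displacement = np.asarray(displacement, dtype=float)
    speed = np.asarray(speed, dtype=float)
    log_p = np.log(np.asarray(power, dtype=float))
    shape = (len(disp_edges) - 1, len(speed_edges) - 1)
    m_grid = np.full(shape, np.nan)
    n_grid = np.full(shape, np.nan)
    di = np.digitize(displacement, disp_edges) - 1  # block row of each sample
    si = np.digitize(speed, speed_edges) - 1        # block column of each sample
    for i in range(shape[0]):
        for j in range(shape[1]):
            sel = (di == i) & (si == j)
            if sel.sum() < min_samples:
                continue
            X = np.column_stack([np.ones(sel.sum()),
                                 np.log(displacement[sel]),
                                 np.log(speed[sel])])
            beta, *_ = np.linalg.lstsq(X, log_p[sel], rcond=None)
            m_grid[i, j], n_grid[i, j] = beta[1], beta[2]
    return m_grid, n_grid
```

The trade-off noted above shows up directly in this sketch: shrinking the blocks narrows the range of ln Δ and ln V in each fit, so noise in the power measurements has a larger effect on the estimated exponents.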

Performance indicator
As aforementioned, the underlying logic behind the traditional method for performance prediction, using the fouling friction coefficient (ΔC F ), as well as the current method is based on observing the distance (along the power axis) between an operational point (Δ, V, P s ) and the reference speed-power-displacement calm-water surface for the ship. In case of the traditional method, the reference surface is usually obtained using an empirical method (which is nothing but a regression model fitted on the model test results obtained from several generalized hull forms) or the model tests conducted for the ship during the design stage. The reference surface obtained using an empirical method may not fit well for the given ship, and the model test results may introduce unknown scale effects while estimating the reference surface for the full scale ship. It is, therefore, critically important to validate the reference surface obtained from these sources using the in-service data recorded onboard the full scale ship. The method proposed, here, establishes the reference calm-water surface directly using the in-service data recorded onboard the ship. Thus, it does not need any further validation. Fig. 13 shows the reference surfaces used by the traditional method (top row), i.e., the best fitted empirical method (updated Guldhammer and Harvald's method, Kristensen and Bingham (2017)) for the given ship, and the current method (bottom row). The reference surfaces in Fig. 13 are, first, divided into a number of sections based on the displacement (Δ), as indicated on the top of the subplots in the first row. The filtered near-calm-water in-service data (presented in Section 6.1), without any corrections, falling in the range of the surface section is, then, plotted with it. 
The vertical distance (along the power axis) between the reference surface and the in-service data samples is indicated by the color intensity of the data samples, with red being on top of the surface and blue below the surface. The goodness of fit for each surface section is indicated by RMSE and R² parameters, shown on top of each surface section subplot. It should be noted that these subplots are projections of 3D surfaces on a 2D plane but the distance (indicated by color intensity) between the in-service data samples and the surface is calculated in 3D. Fig. 13 clearly shows that the reference surface from the current method has a better fit for the lower displacement range whereas the reference surface predicted using the updated Guldhammer and Harvald's method (Kristensen and Bingham (2017)) fits better in the higher displacement range (clearly noticeable for Δ = [45000,55000)). This can be attributed to the fact that the in-service data, used for estimating the reference surface for the current method, has more samples in the lower displacement range (as shown in Fig. 7). A better distribution of in-service data would result in a more accurate reference surface.
Further, the obtained reference surface can be used to predict the performance of the ship over time by calculating the value of the generalized admiralty coefficient (with statistically estimated values of displacement (m) and speed (n) exponents from leg All-T in Table 3) using the filtered near-calm-water in-service data with environmental load corrections, explained in Section 6.2. Fig. 14 shows the evolution of the obtained performance indicator with ship static time along with the trend lines. Here, each hollow circle represents the mean generalized admiralty coefficient obtained using the in-service data recorded at the corresponding ship static time. As in the case of the ΔC F method, the filtered near-calm-water in-service data used here is corrected for environmental loads, so that a clear comparison can be drawn between the current method and the performance predictions by the traditional ΔC F method (shown in Fig. 8).

Fig. 12. Displacement and speed exponents (m and n, respectively) calculated piece-wise using the in-service calm-water data. The data used here is corrected for time as well as environmental loads.
Comparing Figs. 14 and 8, it can be seen that both the methods predict an unnatural trend in the performance change of the ship for some legs (leg 1, 6 & 7 for the current method, and leg 7 for the traditional method). Moreover, both the methods predict a drop in performance after the last propeller cleaning event (between leg 6 & 7). Looking at the slopes for each leg, it is quite noticeable that both the methods predict the biggest performance drop in leg 4 and the second biggest drop in leg 3, and the predicted performance drop in leg 2 is quite comparable. Lastly, observing the overall trend (leg All), the current method (generalized admiralty coefficient) seems to be predicting an appropriate trend showing a drop in performance whereas the traditional method predicts an unnatural increase in the performance of the ship over a duration of 3 years.
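The performance monitoring step can be illustrated with a small sketch. It assumes that the generalized admiralty coefficient takes the form Δ^m V^n / P, by analogy with the classical admiralty coefficient Δ^(2/3) V³ / P, and that the trend line for each leg is a straight line fitted by least squares. The function names are hypothetical.

```python
import numpy as np

def generalized_admiralty_coefficient(displacement, speed, power, m, n):
    """Performance indicator C = displacement**m * speed**n / power.

    Assumption: this form mirrors the classical admiralty coefficient
    (m = 2/3, n = 3) with statistically fitted exponents.
    """
    displacement = np.asarray(displacement, dtype=float)
    speed = np.asarray(speed, dtype=float)
    power = np.asarray(power, dtype=float)
    return displacement**m * speed**n / power

def leg_trend(static_time, coefficient):
    """Slope and intercept of a straight-line trend within one leg."""
    slope, intercept = np.polyfit(np.asarray(static_time, dtype=float),
                                  np.asarray(coefficient, dtype=float), 1)
    return slope, intercept
```

Under this definition, a negative slope means that more power is needed for the same displacement and speed, i.e., a drop in performance over the leg.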

Conclusion
The current work establishes a simple hydrodynamic performance indicator, in the form of generalized admiralty coefficient, to predict the change in performance over time for a sea-going ship using the in-service data recorded onboard the ship. The in-service data recorded onboard a new-built sea-going ship over a period of about 3 years is used to statistically obtain the speed and displacement exponents in the generalized admiralty coefficient for the ship. The fitted generalized admiralty coefficient represents the reference speed-power-displacement surface, in calm-water condition, for the ship.
The extensive literature review presented here indicates a log-non-linear nature of the true reference speed-power-displacement surface for modern hull forms operating in calm water. To account for these non-linearities, the reference surface is fitted piece-wise using several log-linear surface patches, but the piece-wise approach did not produce consistent values, due to the large amount of noise in the in-service data. Therefore, the reference surface, assuming a log-linear form as per the generalized admiralty coefficient, is used here for predicting the performance of the ship over time.

Fig. 13. Comparison between the calm-water reference surface used by the traditional method (calculating ΔC F using the best fitting empirical method for the given ship, updated Guldhammer and Harvald's method (Kristensen and Bingham (2017))) and the current method for performance prediction.
The fitted log-linear reference surface and the performance predictions made using the fitted surface are validated by carrying-out a thorough comparison with the traditional method, i.e., observing the trend in fouling friction coefficient (ΔC F ). The performance prediction results are found to be in good agreement with the results from the traditional method, indicating that the non-linearities in the actual reference surface are not significant.
As the results from the current method are well-validated here, it provides the ship operators with a simple and easily implementable method to monitor the hydrodynamic performance of a ship directly using the in-service data, thereby removing the dependence on empirical methods or model test results. The reference speed-power-displacement surface for calm-water conditions (represented by the generalized admiralty coefficient) can be easily estimated using the in-service data without carrying out any environmental load corrections and marine fouling corrections. The environmental load corrections can be avoided by using a near-calm-water filtering limit for the in-service data, and the data recorded onboard a new-built ship may not need fouling corrections, as indicated by the results in the current work. Thus, the performance of a ship can be simply monitored by observing the trend in the generalized admiralty coefficient (with statistically estimated exponents) using the filtered near-calm-water in-service data.
The results also indicate that the exponents used in the original admiralty coefficient are probably not valid for modern hull forms, but the log-linear relationship can still be used, as an approximation, to represent the true reference surface. It should be noted that the results obtained using the current method are highly dependent on the quality of the in-service data. Moreover, the current method requires an initial data recording time (to estimate the speed and displacement exponents) before it can be used for predicting the performance of a ship, but once the reference surface is established for a ship, using the current method, it can be used to predict the performance of the ship for the rest of its life very easily. On the other hand, the results obtained from the traditional method would surely depend on the validity (for the given ship) of the method used for calculating ΔC F and may lead to inaccurate results due to various reasons like scale effects (as the reference surface used in that case is estimated using the data obtained from model test results of generalized hull forms or the given ship). Therefore, the current method, i.e., using the generalized admiralty coefficient statistically fitted on the in-service data recorded onboard the given ship, proves to be a more robust method for the performance prediction of the ships over time.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix A. Weighted least squares (WLS) regression
In OLS regression, it is inherently assumed that all the fitted samples hold equal importance or weightage. Thus, all the samples used for fitting the model exert an equal influence over the parameters being estimated. A model that treats all of the samples equally would give less precisely measured points more influence than they should have and would give highly precise points too little influence. In statistical terms, OLS assumes that the standard deviation of the error term is constant over all the values of the independent variables. This assumption, however, is not valid for all models.
In WLS regression, the fitted samples are assigned unequal weights so that the samples with higher weights exert a higher influence over the parameters being estimated. The size of the weight may also indicate the precision of the information contained in the associated observation. Here, the estimates are obtained by minimizing the weighted SSR instead of the ordinary SSR, where W = diag([w_1, w_2, …, w_n]) is a diagonal matrix with the diagonal containing the weights assigned to the n given samples. The standard errors of the estimated parameters can be further calculated as follows:

WLS models are, generally, used to treat datasets with non-constant error variances, or heteroscedasticity, identified as a funnel shape in the residual plot (James et al. (2013)). In order to obtain the most precise parameter estimates, each weight should be inversely proportional to the error variance of the corresponding sample. In other words, each weight should be directly proportional to the precision of the corresponding sample.
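A minimal WLS sketch is given below. It uses the standard textbook closed form, β = (XᵀWX)⁻¹XᵀWy, and the usual standard-error estimate based on the weighted residual variance; these are assumptions of this sketch and may differ in notation from the expressions used in the original appendix. The function name is hypothetical.

```python
import numpy as np

def wls_fit(X, y, w):
    """Weighted least squares: minimizes sum(w_i * (y_i - X_i @ beta)**2).

    Returns (beta, se). The standard errors assume that the weights are
    proportional to the inverse error variance of each sample.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    XtW = X.T * w                      # same as X.T @ diag(w), without forming diag(w)
    XtWX = XtW @ X
    beta = np.linalg.solve(XtWX, XtW @ y)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = np.sum(w * resid**2) / dof    # weighted residual variance
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(XtWX)))
    return beta, se
```

With all weights equal, the estimate reduces to the OLS solution; setting a weight to zero removes the corresponding sample from the fit.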

Appendix B. Empirical methods
It is well-known that the results obtained using a statistical machine-learning method are highly susceptible to biases, mainly due to an uneven distribution of data samples. Thus, it is considered very important to do a validation study, if possible, using a previously known and well-established method. In order to validate the above results, the displacement (m) and speed (n) exponents are also obtained for the given ship using the following three empirical methods:
a) Guldhammer and Harvald's method (Guldhammer and Harvald (1970))
b) Updated Guldhammer and Harvald's method (Kristensen and Bingham (2017))
c) Hollenbach's method (Hollenbach (1998))
These three empirical methods are, first, used to calculate the calm-water resistance and the total propulsive power for the given ship over a uniform speed vs mean draft grid (keeping zero trim). An OLS regression model is, then, fitted on these calculated values as per the relation given in Eq. 3. Table 4 presents the estimated parameters obtained from all the three empirical methods. Observing the R-squared (R²) values for the OLS model, it seems to have a very good fit but the residuals plot (shown in Fig. B.15) clearly indicates that the model does not actually fit the data well.⁴ As shown in Fig. B.15, the simple OLS model shows an increasing trend in residuals with increasing propulsive power as well as other variables. This is due to the fact that the linear regression model is being fitted in log scale and, therefore, the OLS model minimizes the sum of square residuals (SSR) in log scale. In order to obtain a better fitted linear model, a WLS regression model is used with weights equal to the square of propulsive power (i.e., w_i = Y_i²) so as to give higher weights to higher propulsive power samples. Fig. B.16 shows the residuals for the WLS model. The WLS model is, clearly, a better fit and it results in a substantially smaller RMSE (as shown in Table B.5b).
It should be noted that the current behavior is not observed in case of the OLS and WLS models while using the in-service data because the data samples in that case are sparsely distributed with substantially fewer samples in the lower shaft power range, thereby automatically giving higher weights to samples with higher shaft power measurements. On another note, from Fig. B.15, it can be observed that the trend in residuals is not linear, as one might expect from the log-linear relation assumed in the generalized admiralty coefficient. This is due to the fact that the data is non-linear in log scale and, therefore, the log-linear relation is a mere simplification of a more complex problem.

Fig. B1. Residuals for OLS model fitted on the calm-water data obtained using the updated Guldhammer and Harvald's method. Y = Propulsive power.

⁴ It is well-known in the statistical community that a thorough investigation of residuals is mandatory to judge the goodness of fit of any statistical machine-learning model; just observing the goodness of fit parameters like R², RMSE, etc. is not sufficient.