Ensemble Kalman Filter data assimilation and storm surge experiments of tropical cyclone Nargis

Data assimilation experiments on Myanmar tropical cyclone (TC), Nargis, using the Local Ensemble Transform Kalman Filter (LETKF) method and the Japan Meteorological Agency (JMA) non-hydrostatic model (NHM) were performed to examine the impact of LETKF on analysis performance in real cases. Although the LETKF control experiment using NHM as its driving model (NHM LETKF) produced a weak vortex, the subsequent 3-day forecast predicted Nargis’ track and intensity better than downscaling from JMA’s global analysis. Some strategies to further improve the final analysis were considered. They were sea surface temperature (SST) perturbations and assimilation of TC advisories. To address SST uncertainty, SST analyses issued by operational forecast centres were used in the assimilation window. The use of a fixed source of SST analysis for each ensemble member was more effective in practice. SST perturbations were found to have slightly positive impact on the track forecasts. Assimilation of TC advisories could have a positive impact with a reasonable choice of its free parameters. However, the TC track forecasts exhibited northward displacements, when the observation error of intensities was underestimated in assimilation of TC advisories. The use of assimilation of TC advisories was considered in the final NHM LETKF by choosing an appropriate set of free parameters. The extended forecast based on the final analysis provided meteorological forcings for a storm surge simulation using the Princeton Ocean Model. Probabilistic forecasts of the water levels at Irrawaddy and Yangon significantly improved the results in the previous studies.


Introduction
Tropical cyclones (TCs) and associated storm surges are the most destructive meteorological disasters in South and Southeast Asia, causing massive loss of life and huge damage. In 2008, the cyclone Nargis struck southeastern Myanmar, claiming more than 100 000 lives (Webster, 2008). This enormous loss of life was mainly due to the storm surge associated with Nargis in the coastal area of the Irrawaddy delta. As a critical meteorological phenomenon, Nargis has attracted much research on its genesis (Kikuchi et al., 2009;Kikuchi and Wang, 2010), its motion with a rapid intensification (Yamada et al., 2010) and its predictability Kuroda et al., 2010;Saito et al., 2010;Shoji et al., 2011).
Regarding the predictability of Nargis, the question whether an early warning message could have been dis-seminated is the main concern. To investigate this problem, Kuroda et al. (2010) applied a dynamical downscaling from the global model GSM of Japan Meteorological Agency (JMA) using the JMA's operational non-hydrostatic model (NHM) with a horizontal grid spacing of 10 km. This is a simple method suitable for developing countries like Myanmar in employing an operational numerical weather prediction (NWP) system. Although the forecast outperformed the GSM forecast especially in intensity, the forecasted track exhibited a northward displacement, and the forecasted intensities were still under-estimated compared with the best track data of the Regional Specialized Meteorological Center (RSMC) New Delhi. Kunii et al. (2010) and Shoji et al. (2011) used a more sophisticated method to obtain better initial conditions for Nargis, which could have led to better forecasts. These authors adapted the JMA's regional variational meso-scale data assimilation system for the Nargis case. The results were better than those of Kuroda et al. (2010) since the analysed vortex structure *Corresponding author. email: leduc@jamstec.go.jp Tellus A 2015. # 2015 L. Duc et al. This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), allowing third parties to copy and redistribute the material in any medium or format and to remix, transform, and build upon the material for any purpose, even commercially, provided the original work is properly cited and states its license. became more consistent with the used model. For a meteorological disaster with a high risk of damage like Nargis, the probabilistic information is preferable. Saito et al. (2010) performed an ensemble forecast for Nargis and its associated storm surge. The initial and boundary perturbations interpolated from the large-scale perturbations of the JMA's 1-week global ensemble prediction system (WEP) were added to the downscaling system of Kuroda et al. (2010). Although the 10 km ensemble forecast significantly improved the intensity forecast compared with the JMA's WEP ensemble forecast and the track forecasts were more informative than the deterministic forecast, the northward displacement and the underestimation of intensities seen in Kuroda et al. (2010) remained.
In this study, we have innovated the previous works by the above authors on the predictability of Nargis. The problem was approached using a method that can generate both the initial condition and the initial perturbations for an ensemble forecast, namely, the Ensemble Kalman Filter (EnKF). Many studies (e.g. Zhang et al., 2009;Hamill et al., 2010) have shown that EnKF is a promising method to produce initial conditions for TC prediction. One of the advantages of EnKF is the direct use of nonlinear observation operators in assimilation. Chen and Snyder (2007) have proposed to use an observation operator for TC positions and intensities, which is equivalent to assimilation of these parameters in EnKF. Wu et al. (2010) proceeded further by assimilating TC speeds and TC wind structures associated with a prescribed intensity. However, EnKF itself has some intrinsic limitations like the spin-up problem as shown in Caya et al. (2005) or the need of inflation and localisation of background covariances (Anderson and Anderson 1999;Hamill et al., 2001). These problems should be considered in the design of all experiments. We adopted a square root variation of EnKF called the Local Ensemble Transform Kalman Filter (LETKF) in which assimilation is employed locally at each grid point (Hunt et al., 2007). Kunii and Miyoshi (2012) have used this method together with the WRF model in a simulation of the typhoon Sinlaku over the western Pacific with good results. In this research, two authors have demonstrated the effect of SST perturbations in LETKF by drawing the SST perturbations from a set of SSTs at different dates. Based on the promising results of this study, we not only employed LETKF but also explored the possible use of more practical SST perturbations in the Nargis case. Since the observations available over the Indian Ocean for Nargis were quite sparse, assimilation of TC advisories has been taken into account as well.
The next section presents the used data and methods in detail. The main results of this study are described in Section 3. In this section, apart from the NHMÁLETKF control experiment, we examine the impact of SST pertur-bations and assimilation of TC advisories separately. The other issue in this section is the spin-up problem that is intrinsically associated with EnKF. It was found that while SST perturbations had an almost neutral impact on the forecast, assimilation of TC advisories could improve the track and intensity forecast considerably. Therefore, the latter was used in the NHMÁLETKF experiment that generated meteorological forcings for a subsequent Nargisinduced storm surge forecast. Section 4 shows the results of this ensemble forecast. Section 5 summarises the study and proposes some issues that can be the next research subjects.

Data assimilation system
In this study, the NHMÁLETKF system originally developed at JMA was adopted with some modifications. NHMÁ LETKF comprises a forecast model NHM, an assimilation module LETKF and a QC system. The driving model was the JMA's operational NHM (Saito et al., 2006;Saito, 2012). The version used was of April 2013. The QC system was derived from that of the JMA's operational meso-scale analysis system (JMA, 2013). This enabled NHMÁLETKF to access and use all observations available for the routine assimilation in JMA. The kinds of observations that NHMÁ LETKF could have assimilated consisted of conventional data, sea winds, precipitable water, atmospheric motion vectors (AMV) and radiances.
The assimilation scheme of NHMÁLETKF was 4DÁ LETKF which was originally implemented by Miyoshi and Anarami (2006), and later modified by Fujita et al. (2009). The control variables were two horizontal wind components u and v (m/s), temperature T (K), specific humidity qv (kg/kg), surface pressure ps (hPa) and ground temperature gt (K). To be consistent with the QC system used, all observation operators in NHMÁLETKF were inherited from those in the JMA's operational meso-scale analysis system (Honda et al., 2005;Honda and Sawada, 2008).
NHMÁLETKF employed localisation in the observational space. The horizontal and vertical localisation length scales are the width d of the Gaussian localisation function e Àr 2 =2d 2 , where r denotes the distances between model grid points and observations. That means a local patch for each model grid point is not needed in the observational localisation method. However, since Gaussian localisation functions are almost zero at long distances, the observations used in assimilation at each grid point can be limited into a local region to accelerate running time in practice. The size of this region should be chosen such that all observations outside this region have negligible influence on the considered grid point. These regions can be identified with the local patches of the observational localisation method.
In all experiments, the horizontal localisation length scale of 550 km was used. The coherent vertical structure of TCs has an imprint on the shape of vertical covariances which stretches over whole atmosphere columns. Therefore, the vertical localisation length scale was set to the number of model vertical levels to recognise this fact in the Nargis case. Temporal localisation was not considered in this study.
It is well known that inflation factors were introduced into the EnKFs as an ad-hoc method to account for model error covariances and loss of variance due to sampling when estimating background covariances. The inflation factors in NHMÁLETKF were estimated adaptively as described in Miyoshi (2011), which calculates inflation factors by a scheme similar to the Kalman filter type. The background or first-guess inflation factors are given from the estimated ones in the last cycle of LETKF. The observed inflation factors together with their errors are computed from the statistics of innovations. If no observations existed inside the local patch of a grid point, the estimated inflation factor at this grid point is identical to the background one. At the first cycle of LETKF, the background inflation factors must be specified by a certain value. If observations are sparse, this value will affect the estimated inflation factors not only in the first assimilation cycle but also in subsequent cycles. In NHMÁLETKF, we chose a value of 1.44 for the first value of the background inflation factor, which implies the model error covariance was assumed to be 20% of the forecasted error covariance at the first LETKF cycle.

Experiment settings
To obtain analyses for Nargis, NHMÁLETKF was run over a domain covering the Bay of Bengal as depicted in Fig. 1. This domain consisted of 201)181 horizontal grid points with 20 km horizontal grid spacing. The vertical grid had 40 levels with a 22 km top. This analysis domain was fixed in all experiments. The same geographical domain was adopted in the extended forecasts when NHM was run using the analyses from NHMÁLETKF experiments as initial conditions. However, in these subsequent extended forecasts, the horizontal resolution was refined to 10 km, and the number of vertical levels was increased from 40 to 50.
The ensemble part of NHMÁLETKF was composed of 50 members and run at a 3-hour assimilation cycle. The lateral boundary conditions for all members were obtained from forecasts of GSM and forecasts of WEP available at 0.58 and 1.258 horizontal grid spacing, respectively. Saito et al. (2012) pointed out the importance of lateral boundary perturbations in LETKF. In this study, the forecast perturbations derived from WEP forecasts were first interpolated to the analysis domain, and then were rescaled and added to or subtracted from the interpolated GSM forecasts valid at the same time. The resulting perturbed forecasts were used as the boundary conditions for the ensemble members. This process to derive boundary perturbations is described in detail in Saito et al. (2010Saito et al. ( , 2012. At the first cycle of NHMÁLETKF, this process also provided the initial perturbations for all members. Most experiments used the JMA's daily average SST analyses as the bottom boundary conditions. When perturbed SSTs were applied, SST perturbations were generated from SST analyses of seven operational centres: Fleet Numerical Meteorology and Oceanography Center (FNMOC), JMA, Jet Propulsion Laboratory (JPL), National Climatic Data Center (NCDC), National Center for Environmental Prediction (NCEP), Remote Sensing Systems (REMSS) and United Kingdom Meteorological Office (UKMO). The details of this calculation are described in Section 3.2.
In each experiment, NHMÁLETKF was started 2 d prior to the target time 12 UTC 30 April 2008 to alleviate impact of the spin-up problem. Then, the resulting analysis was used as the initial condition for a 72-hour NHM deterministic forecast. The boundary condition for this run was interpolated from GSM forecasts. In parallel with the deterministic forecast, a 50-member ensemble forecast was performed using NHMÁLETKF analysed perturbations as initial perturbations. The boundary conditions for ensemble members were derived from WEP and GSM forecasts using the same method as in the ensemble part of NHMÁLETKF.
To examine the impact from NHMÁLETKF, a direct downscaling from GSM was carried out. The JMA global analysis system using GSM as the driving model adopted the 4DVAR method in incremental form. The resolution of the inner loop was T159L60 (about 80 km, 60 vertical levels) and the one of the outer loop was T959L60 (about 20 km, 60 vertical levels). That means the effective resolution of the global analyses (80 km) was quite coarse compared to the resolution of NHMÁLETKF analyses in this study. Here, the global analyses were provided at the resolution of the outer loop.
The Princeton Ocean Model (POM) (Blumberg and Mellor, 1987) was used to forecast the storm surge induced by Nargis. The settings for the integrated domain and grid were adopted from Kuroda et al. (2010). This domain had 451)391 horizontal grid points with 3.5 km horizontal grid spacing. The number of vertical levels was 12. The surface winds and mean sea level pressures were ingested into POM every 10 minutes. These fields were interpolated from the forecasted fields of the NHM extended runs. The forecast range was the same as that of the extended run (72 hours). Note that POM was run without taking into account the effect of astronomical tides, thus, the storm surge forecasted only resulted from meteorological factors. Figure 2 illustrates the number and kind of observations assimilated by NHMÁLETKF at an arbitrary time (12 UTC 29 April 2008). Almost all observations received in the JMA's global analysis system over the Bay of Bengal were assimilated into NHMÁLETKF except radiance data. Here, radiances were not used since the bias correction module for these data has been under development. Note that both the JMA global analysis system and NHMÁ LETKF only assimilated radiance data in clear-sky regions. Using the operational global analysis system at the National Aeronautics and Space Administration for the same Nargis case, Reale et al. (2009) pointed out that radiance data had substantial impact only when cloudy radiances were assimilated. Therefore, analyses were expected to be improved if NHMÁLETKF could assimilate radiance data in cloudy regions.

Observations
In the assimilation domain, the observations were quite sparse. Over the land, the main source of observations was SYNOP and radiosonde stations in South Asia countries (Fig. 2a). Over the ocean, the observations were mainly sea winds from QuikSCAT and ASCAT (Fig. 2c), and precipitable water estimated from microwave instruments like SSM/I in DMSP satellites (Fig. 2d). The sea wind dataset has a resolution of 50 km. The observation error was assumed to be 3 ms (1 . These values for precipitable water dataset were 25 km and 5 mm, respectively. In addition, cloud motion winds (AMVs) estimated from geostationary satellites METOSAT-7 and MTSAT-1R provided other valuable information (Fig. 2b). A spatial thinning of one degree was applied for AMV data to avoid observations with correlated errors.
When TC advisories were used as additional observations, this information was collected from the Joint Typhoon Warning Center (JTWC) and RSMC New Delhi. Although RSMC New Delhi is responsible for issuing the operational warning for TCs in Indian Ocean as assigned by WMO, the subjective evaluation pointed out that the data from JTWC are more realistic in this case. However, since the previous works mostly used RSMC New Delhi's TC advisories in verification, we still referred to this dataset in this study.

Verification method
TC tracks and intensities were derived from the NHM forecast fields. TC centres were defined as the average among the minimum points of mean sea level pressures (pmsl), and geopotentials at 700 (h700) and 850 hPa (h850). TC intensities were represented by the minimum values of pmsl fields. A track programme was developed to perform this task. Although the programme only used pmsl, h700 and h850 in the definition of TC centres, additional fields like wind velocities at 850 hPa and aloft temperatures were required in detecting TC centres.
All 50 TC tracks derived from the ensemble forecast in each experiment when plotted make the plot less readable. To keep the plot reasonable to follow but still retain the essential information provided by an ensemble forecast, we adopt the approach proposed by Hamill et al. (2010) by only plotting the ensemble mean of forecasted tracks and the forecasted error covariances of TC centres. This error covariance is presented as a 2x2 matrix at each forecasted time and can be estimated from the ensemble. Employing the eigenvalue decomposition for this matrix, the directions along which the forecasted errors attain the maximum and minimum values are given by the first and the second eigenvectors, respectively. Note that those directions may not coincide with the along-track or cross-track directions. The magnitudes of the maximum and minimum errors are the square roots of the corresponding eigenvalues. These values together with the directions of the maximum and minimum errors define two principal axes of an ellipse centred at the ensemble mean of TC centres. If the underlying distribution of TC centres is assumed to be the normal distribution, this 1-sigma ellipse accounts for 68% of the probability that a TC centre falls inside. These ellipses are considered as the representatives for forecasted error covariances and plotted in all figures related to track forecasts (e.g. see Fig. 4a and b).

Results
When employing NHMÁLETKF for Nargis, we found that the analysed vortex was sensitive to the assimilated sea wind observations near the Nargis' centre. In our experiments with a limited spin-up period and a limited use of observations compared with the operational system, a positive impact is expected by retaining those rejected observations. By loosening the QC criteria for sea wind observations, the additional assimilated observations have contributed to produce a better analysis and consequently a more accurate forecasted track (not shown here). Therefore, the relaxed QC programme was applied for all experiments in this section.

Control experiment
In the first experiment, NHMÁLETKF was run without any modifications on its components. The results were then compared with the direct downscaling from GSM and WEP ensemble (GSMÁWEP). Figure 3 shows the pmsl fields analysed by GSM relative to NHMÁLETKF. The Nargis' intensity as analysed by NHMÁLETKF (1002 hPa) was slightly weaker than that analysed by GSM (999 hPa). Both analysed intensities were weaker than the intensity estimated by JTWC (974 hPa). The analysed position of TC centre had an error of 0.98 with GSM and 0.58 with NHMÁLETKF. Note that Fig. 3a and b do not show the entire analysis domain but only the region around the storm centre. Figures 4 and 5 show the corresponding 72-hour Nargis track and intensity forecasts initialised by the GSM and NHMÁLETKF analyses. Clearly, both track and intensity forecasts were improved considerably when using NHMÁ LETKF. Using the GSM analysis in the extended forecast, the control run had a northward displacement similar to the previous studies Saito et al., 2010), which attained a magnitude of approximately 80 km at the landfall time (Fig. 4a). The ensemble mean forecast also agreed with the deterministic one. In contrast, the deterministic and ensemble mean forecasts based on NHMÁ LETKF run close to the JTWC's best track with no northward bias. During the first 24 hours, the track spreads in GSMÁWEP were very small which can be contributed to the use of large-scale perturbations as initial perturbations. The ensemble forecast by NHMÁLETKF did not suffer from the lack of spread as in the GSMÁWEP case.
Nargis forecasted by the GSM downscaling made landfall 9 hours prior to the actual landfall time. The forecasted central pressure reached its minimum value around this landfall time (Fig. 5a). The situation was quite different in the forecast in which the NHMÁLETKF analysis was used. Here, the deterministic forecast produced a large improvement in the track forecast, especially the landfall time (a 3-hour lag compared to a 9-hour lag in the GSM downscaling case). This resulted in the central pressures being in phase with the observed one when the Nargis' intensity reached peak intensity at the 42-hour forecast range (Fig. 5b). However, the peak intensity was 970 hPa, which was far from the value of 938 hPa estimated by JTWC. At this minimum point, the ensemble mean forecast was 975 hPa, which was 5 hPa greater than the deterministic one. This forecast is not a bad result considering a 10-km grid spacing was used in all simulations. This resolution is not fine enough to resolve the inner structure of TCs, thus, not expecting to reproduce such intense TCs.
To verify the work of NHMÁLETKF, a diagnostic analysis of the innovation vectors was performed based on the statistics of innovations where x f is the background field, y the observations, P f the background error covariance, R the observation error covariance, H the observation operator and DH the Jacobian of H. Note that LETKF does not need an explicit form for DH and uses forecast perturbations in the observation space to estimate the first term on the righthand side of eq. (1). Since only one observation is available to compute each entry in the matrix on the left-hand side of eq. (1), an average over observations at different locations is usually favoured to get more reliable statistics. Consider only radiosonde observations, Fig. 6 compares the averages of the diagonal terms of the left-hand side of eq. (1) or prior root mean square errors (RMSEs) (the solid line with circle symbols) against those of the right-hand side or predicted RMSEs (the solid line with diamond symbols) for zonal winds and temperatures at the standard pressure levels. Due to sampling errors, these two lines cannot be identical but should be somehow similar as illustrated in Fig. 6b. Figure 6a shows that the predicted RMSEs were less than the prior RMSEs at all pressure levels, which implies the forecast spreads of zonal winds were not large enough. In addition to the RMSEs of background fields (prior RMSEs), Fig. 6 also plots the RMSEs of analysis fields and their spreads. This plot points out that NHMÁLETKF worked properly when the analysis RMSEs and spreads were less than those of backgrounds, which resulted from assimilation. Although NHMÁLETKF has improved the forecast significantly, there remained a weak point in the forecast: the forecasted vortices were weaker than the estimated one, especially at the initial time and in the mature period of Nargis. Therefore, an investigation on a further improvement was carried out and is described in the next two subsections.

SST perturbations
As shown by Saito et al. (2012), SST perturbations likely have a positive impact to the underestimation of the forecast errors in the lower model atmosphere in LETKF. Kunii and Miyoshi (2012) demonstrated the effect of SST perturbations with their WRFÁLETKF system in the forecast of Typhoon Sinlaku, but the SST perturbations were artificially generated from a set of SSTs at different dates. In this study, to represent SST uncertainties in a more realistic way, the SST analyses from seven centres FNMOC, JMA, JPL, NCDC, NCEP, REMSS and UKMO were employed to perturb the bottom boundary conditions of NHMÁLETKF as described in Section 2.
All centres provided daily average SSTs, which were one global SST field per day. In addition, FNMOC provided SSTs four times per day valid at 00, 06, 12 and 18 UTC. To derive SSTs for other centres at these hours, we assumed that the deviations of temporary SSTs from daily average SSTs were equal for all centres. By adding the temporary deviations of FNMOC SSTs to the daily average SSTs of any centre, the SSTs of the corresponding centre at 00, 06, 12 and 18 UTC could be obtained. Figure 7 illustrates this process with the time evolutions of the SSTs averaged over the Indian Ocean in the assimilation period. Notice that all evolutions were in phase with each other.
The above calculation implies that SSTs were updated every 6 hours in all experiments. However, the analysis error covariances of SSTs were only updated every day. This can be seen if we write out the relation between the temporary SSTs and the daily average SSTs here, t denotes SSTs at any specific time, d daily average SSTs, i the index for each centre and DSST t deviations at any specific time. Since the deviations are assumed the same for all centres, the perturbations at any specific time become where, the averages are taken over all members. This verifies that the sample error covariances of SSTs were unchanged throughout a day.
To make a fair comparison between the LETKFs using fixed SSTs and perturbed SSTs, the experiment with fixed SSTs was conducted using the ensemble mean SSTs instead of JMA SSTs as the bottom boundary conditions. In the experiments with perturbed SSTs, we examined two methods in generating SST perturbations. In the first method (Perturbed SST 1), each ensemble member obtained its SST fields by adding the ensemble mean SSTs to a SST perturbation chosen randomly from the seven SST perturbations valid at the beginning of each assimilation cycle. In the second method (Perturbed SST 2), each ensemble member only used SSTs from one specific centre, which was chosen randomly at the first cycle of NHMÁ LETKF. Compared with the first method, the difference was that instead of drawing an SST sample at every cycle each member only employed this procedure at the first cycle and applied the obtained centre for all subsequent cycles. This method avoids the abrupt changes of SSTs between consecutive cycles that can occur in the first method, and also considers the practical aspect of including SST uncertainties in numerical weather prediction.
We remark that a perturbation could be expressed as a linear combination of seven perturbations with the linear coefficients drawn from a standard normal distribution. However, we chose a simpler method by limiting the available perturbations to the seven given perturbations (the linear coefficient is 1 for a specific centre and zero for the other).
The diagnosis analysis for NHMÁLETKF using fixed and perturbed SSTs was performed (not shown here) similar to the diagnosis analysis for NHMÁLETKF in Fig. 6. We found that the RMSEs or spreads in both cases were almost identical. Since radiosonde observations were used as the reference in the diagnosis analysis, and all radiosonde stations were in land, this explains why the impact of SST perturbations was hardly recognisable. To access the impact of SST perturbations, we explored the analysis spreads averaged just over ocean for zonal winds and temperatures as depicted in Fig. 8. Clearly, the included SST uncertainty caused the analysed temperature spreads to increase at the low model levels. Although the spreads were similar in magnitude for both SST perturbation methods at the lowest model level, the spreads of Perturbed SST 1 were smaller than those of Perturbed SST 2 at the levels above the lowest level. This shows that Perturbed SST 2 was more effective than Perturbed SST 1 in propagation of temperature uncertainty from the sea surface to the low model levels. The impact of SST perturbations was neutral for zonal winds, though a careful examination could identify a minor increase of spreads at the low levels.
The impact of SST perturbations on the forecasts of Nargis' track and intensity is shown in Figs. 9 and 10, respectively. Here, all extended forecasts applied the same SST perturbations as in the assimilation phase. Note that for the control forecasts, the same SST (ensemble mean) was used in the three experiments. In these forecasts, the track error of fixed SST (Fig. 9a) was the largest and the one of Perturbed SST2 (Fig. 9c) was the smallest after the landfall. However, in terms of central pressures Perturbed SST1 (Fig. 10b) yielded a better agreement with JTWC than Perturbed SST2 (Fig. 10c). Nargis' central pressure reached the minimum after 42 hours in Perturbed SST1, which was similar to JTWC data, and 54 hours in Perturbed SST2. In the ensemble mean forecasts, the interesting thing is that all ellipses in Perturbed SST2 were more elongated than those in Perturbed SST1, showing that the forecasted TC centres distributed more uniformly in the latter case. This means the forecast uncertainty in Perturbed SST1 was the same for all directions which produced an ensemble forecast less informative than Perturbed SST2. In general, the impact was quite modest showing that SST perturbations had an almost neutral impact on the forecast. The Nargis' track was well predicted by the forecast initialised by the NHMÁLETKF using the fixed SSTs, and it was difficult to surpass this forecast. Also, the limited number of SST perturbations is another possible reason for such neutral impact.

Assimilation of TC advisories
To assimilate TC advisories, an observation operator was implemented in the assimilation module of NHMÁ LETKF. This operator was derived from the track programme to detect TC centres and intensities as described in Section 2.4. Acting on each time slice of each member forecast for a given TC, the operator produced three parameters: the longitude and latitude of TC centre and its central pressure. These three parameters were also the information provided by JTWC or RSMC New Delhi in their TC advisories. Since these data from JTWC were more realistic than the data from RSMC New Delhi in this case, we chose to assimilate JTWC's TC advisories into NHMÁLETKF. Therefore, RSMC New Delhi data as the verification dataset were not plotted in all figures in this section.
The longitudes and latitudes of TC centres played two roles in assimilation. They were two of three components of a TC observation (TC advisory) that were assimilated into NHMÁLETKF (the other was the TC central pressure) but at the same time indicated the coordinate of this TC observation. That means TC advisories were equivalent to a special kind of two-dimensional observations like surface pressures. TC longitudes and latitudes as coordinates of TC advisories were necessarily required in localisation. In horizontal localisation, the same localisation length scale in Section 2.1 was used, whereas in vertical localisation no localisation was applied for this special kind of observations. An observation error of 0.258 was assumed for both longitudes and latitudes of TC centres. This value was slightly less than the value of 0.38 used by Chen and Snyder (2007) and Wu et al. (2010). The observation error of central pressure (P err ) was more flexible in specification. Chen and Snyder (2007) and Jung et al. (2012) assigned this error to 5 hPa. Since this parameter plays an important role in determining the analysed intensities of Nargis and its resulting forecasts, we performed three sensitivity tests with respect to different values of P err : 100 hPa, 8 hPa and 4 hPa. The first experiment in which P err equalled 100 hPa was virtually equivalent to just assimilation of TC positions in NHMÁLETKF. In contrast, the third experiment gave high reliability to the estimated intensities by putting more weight on TC advisories in assimilation.
The most obvious effect of assimilation of TC advisories was on the analysis of Nargis' intensity. Figure 11 shows the analysed pmsl fields in three experiments. The experiment in which TC advisories were not assimilated can be referred to in Fig. 3b. As expected, Nargis became deeper with decreasing P err . When P err was large, the analysed vortex was similar to the control case (no assimilation of TC advisories), but the vortex position was shifted westward. This shift, which is more clearly in Fig. 12, was in common among all experiments irrespective of values of P err . By assimilating the estimated positions, the analysed ones were corrected, and the analysed spreads of Nargis' centre became smaller. When P err was set to 8 hPa, the analysed central pressure was equal to 1000 hPa (Fig. 11b), which was 2 hPa deeper than the corresponding one in the control experiment. This value dropped to 982 hPa, representing a strong vortex, when P err was halved further (Fig. 11c). This verifies that the new observation operator worked properly in NHMÁLETKF.
When the analysed intensity of Nargis was strengthened, the largest effect was on the intensity forecast. Figure 13 demonstrates this fact in which the most obvious impact can be seen in Fig. 13c. This figure shows the time evolutions of forecasted central pressures in the experiment using P err 0 4 hPa. In the control forecast, starting at the intensity of 982 hPa, Nargis reached its peak of 962 hPa after 36 hours. The corresponding value forecasted by the ensemble mean was 967 hPa. This peak value was much less than the estimated one 938 hPa from JTWC which can be understood since a 10-km model cannot reproduce such intense TCs. The two remaining experiments exhibited the same improvement of intensity forecasts in both deterministic and ensemble forecasts.
Can this seemingly good intensity forecast be accompanied with a good track forecast? Figure 12 shows the Fig. 10.
As Fig. 9 but for the forecasted intensities.
EnKF DATA ASSIMILATION AND STORM SURGE EXPERIMENTS forecasted tracks that correspond to the forecasted intensities in Fig. 13. The noticeable thing is that when the initial vortex became stronger, the forecasted track tended to shift northward. The same result was observed in Kunii et al. (2010) when the authors employed vortex bogusing together with the 4DVAR technique to generate initial conditions for the same TC (Nargis). Here, the vortex dynamics such as beta gyres may play an important role. Chan and Williams (1987) have shown that northward movement of a TC due to the beta drift increases with its maximum sustained winds. This can be used to explain for the northward displacement observed in all experiments in which the analysed vortices were strong. Figure 13 suggests that to obtain a good forecast, a too small value should not be assigned to P err . Unlike 4DVAR method, 4D-LETKF does not use the model dynamics as a constraint in assimilation. Therefore, an intense vortex can be obtained in the analysis at a low resolution simply by reducing the value of P err , thus, putting more weight on the estimated vortex, even though the model dynamics cannot simulate such a strong vortex at this resolution. When ingested into the driving model in the next assimilation cycle, the resulting analysis causes imbalance in the forecast fields which has an effect on the analysis in the next analysis phase. This explains why we should not force the analysed intensity toward the estimated one since this value may not be appropriate for a driving model at a low resolution. When P err was 8 or 100 hPa, the track forecasts were performed very well with almost perfect landfall point. This resulted from the relocation of the initial vortex around the estimated position. This fact and the preceding evaluation show that assimilation of TC advisories had a positive impact on track and intensity forecasts providing that the observation error of central pressure was not too small. Pmsl fields analysed by NHMÁLETKF when assimilating TC advisories with different observational errors of central pressures (P err ): (a) P err 0100 hPa, (b) P err 08 hPa, (c) P err 04 hPa.  As Fig. 11 but for the forecasted intensities.

Spin-up time
To produce an analysis for Nargis at the target time, we run NHMÁLETKF in a 2-d assimilation period. At the initial time of this period (12 UTC 28 April), the analysis was interpolated from the global analysis of JMA. Since the TC vortex represented in the global analysis at this initial time was weak, this period may not necessarily be long enough so that the vortex could not reach a wellbalanced structure consistent with the model dynamics. To investigate this problem, we rerun the control NHMÁ LETKF starting 3 or 4 d before the target time instead of 2 d as in all previous experiments. No SST perturbations or assimilation of TC advisories were applied in these two additional runs. The results are illustrated in Fig. 14 through the Nargis' track and intensity forecasts. As expected, when the assimilation period was increased, the analysed central pressure dropped from 1002 to 1000 hPa. The positions of analysed vortices were also improved considerably. These vortices in both cases were quite similar to that of assimilation of TC advisories with P err 08 hPa. This seems to suggest that assimilation of TC advisories may not be needed if the assimilation period is extended.
However, these seemingly good analysed vortices yielded the track forecasts with northward displacement as depicted in Fig. 14a and c. It was even worse as Nargis was forecasted to make landfall 6Á9 hours earlier than the actual observation. This certainly would degrade the accuracy of storm surge forecast if POM was applied. What caused this difference between these experiments and the experiment that assimilated TC advisories using P err 08 hPa? Was there a connection with the previously discussed problem in which the forecast initialised with a strong vortex tended to shift northward?
Out hypothesis is that the adaptive inflation scheme in NHMÁLETKF is partly responsible for this northward displacement. As described in Section 2.1, this scheme works like a Kalman filter in estimating inflation factors. Therefore, at the first cycle of NHMÁLETKF background inflation factors must be given, which were assumed as a homogeneous field of 1.44 in all experiments. That means that over grid points with no adjacent or sparse observations this value will be kept unchanged or changed slowly throughout the assimilation period. This leads to inflation of background perturbations and consequently analysis perturbations at these grid points even if no observations contribute to their analysis increments. In short assimilation periods, this may not have a noticeable impact, but in longer assimilation periods this will affect the balance of analysis ensemble.
Can we improve LETKF to avoid such imbalance? A possible solution is that at grid points where no observations exist, it is better to keep analysis spreads the same as background spreads instead of inflation. This strategy was followed in the relaxation to prior variance method proposed by Whitaker and Hamill (2012). We have applied this inflation method for NHMÁLETKF in the typhoon Haiyan case in 2013 with promising results. We will pursue this interesting problem in our future work.

Storm surge forecast
After investigating the effect of SST perturbations and assimilation of TC advisories on forecast performance, we decided to use assimilation of TC advisories with P err of 8 hPa in the final NHMÁLETKF that provided meteorological forcings for the subsequent storm surge forecast through an extended 72-hour forecast. Since the impact of SST perturbations was almost neutral, this technique was not applied here. The extended run based on the resulting analysis is displayed in Figs. 12b and 13b. As for the control run, both the track and intensity of Nargis were quite well simulated, especially landfall location.
The extended forecast provided surface wind and pmsl fields for the storm surge forecast using POM. As the previous studies Saito et al., 2010), we examined the detailed water level forecasts at two points: Irrawaddy and Yangon. Figure 15 shows the time evolutions of wind speeds, wind directions and water levels at Irrawaddy and Yangon predicted by the ensemble forecast. The deterministic forecasts are also given by the lines of circle points. At each instant time a probabilistic forecast is represented under a form of box-and-whisker plot.
These results clearly outperform the previous storm surge simulations Saito et al., 2010). In their simulations, the highest water levels at the Irrawady and Yangon points were 3.2 m and 1.6 m in the control runs, respectively. When the ensemble forecast was used, among all ensemble members the maximum predicted values were 4.0 m at Irrawady and 2.4 m at Yangon. However, the highest water levels occurred 6 hours earlier than the estimated time. The failure of storm surge reproduction especially at Yangon in these simulations has an intimate relation to the northward bias of the forecasted tracks. When the northward bias was removed as in our run, the storm surge forecast was notably improved as depicted in Fig. 15. The control run predicted the highest water level of 2.0 m at Yangon (see Fig. 15b) with the occurrence time consistent with the actual time.
However, a careful examination at the occurrence time of the highest water level seems to indicate under-estimation in the ensemble forecast when at least 75% of members predicted water levels below 2.2 m at Irrawaddy and below 1.2 m at Yangon. This seems to be inconsistent with the good track and intensity forecasts as depicted in Figs. 12b and 13b. To investigate the cause of this under-estimation, in Fig. 15 all individual forecasts are displayed as the dash lines in addition to the box-and-whisker plots. Consider the probabilities for the water level above 3 m at Irrawaddy at each instant time, all instant probabilities are less than 0.1. However, if we disregard the time factor, and by counting the number of dash lines that cross the horizontal line Z 03 the probability becomes 0.36, which implies a high risk for flooding near Irrawaddy. Therefore, the under-estimation resulted from the fact that according to each member the highest water level occurred at different times. Consequently, at an instant time there existed only a limited number of members predicting a high water level. This means uncertainty in time should be taken into account when predicting the water level at a specific location.
The above analysis suggests a new type of plot (see Fig. 16) that includes uncertainty in time in predicting storm surges. This plot is similar to the strike probability map used in ensemble forecast of TC tracks, showing the probability that a TC is within a certain range in a certain period for each grid point. Here, each subplot in Fig. 16 shows the probabilities of the water level above a given threshold for all grid points regardless of the occurring time. These informative maps are plotted for the storm surge forecasts based on the final NHMÁLETKF and the GSM downscaling using the thresholds of 1.0 and 3.0 m. Near the Irrawaddy point the strike probabilistic maps with respect to the water level of 3.0 m are quite similar between two experiments. However, near the Yangon point, the situation is different when the water levels above 3.0 m were predicted with a small probability by the POM run using the final NHMÁLETKF forecasts. The difference becomes more evident when the threshold of 1.0 m is considered. The forecasts based on the final NHMÁLETKF clearly issued a warning for the risk of high water levels at an area larger than that based on the GSM downscaling.
Concerning the occurrence of a high water level at a certain point, although the probability of such an event is considered more important than its occurring time, the latter is still needed in warnings. Figure 16 suggests that the control and the ensemble mean can be used to predict the phase of the water level, especially the occurring time of the highest water level.

Summary and conclusion
In this paper, we have revisited the predictability problem of the Myanmar cyclone Nargis. Our approach was to use LETKF to generate an analysis for a deterministic forecast and its associated analysis perturbations for an ensemble forecast. The forecasted fields then served for a storm surge forecast using the POM model. For this purpose, we have adopted the NHMÁLETKF data assimilation system developed at JMA. Forecasts of wind speeds, wind directions and water levels based on the NHMÁLETKF using assimilation of TC advisories with P err 08 hPa at: (a) Irrawaddy point and (b) Yangon point. The ends of the whiskers represent the 5th and 95th percentiles.
The control experiment with this NHMÁLETKF (after relaxing the specific QC criteria for sea winds) resulted in a track forecast close to the estimated one. The forecasted landfall time had a time lag of 3 hours. This forecast outperformed GSM downscaling especially in the landfall location and time. However, NHMÁLETKF analysed the vortex weaker than GSM. To explore a further possibility for improvement, we have implemented two techniques into NHMÁLETKF: SST perturbations and assimilation of TC advisories.
By introducing SST perturbations, we expect SST perturbations can improve environmental flows as demonstrated by Kunii and Miyoshi (2012), thus, leading to a better track forecast. GSM usually employs vortex bogusing in the western Pacific but not for other oceans including the Indian Ocean. Therefore, assimilation of TC advisories was expected to improve the analysed vortex structure of Nargis. We have found that in our case the use of SST perturbations had a slightly positive effect on the forecast of Nargis' track. In particular, instead of choosing a SST field randomly among all SST analyses at each assimilation cycle, SST perturbations are more effective when each ensemble member was tied with a specific centre randomly assigned at the first assimilation cycle in getting its SST field. Assimilation of TC advisories had positive impacts with a reasonable choice of its free parameters.
There still exists one unsolved problem needing more investigation in future work. This relates to the northward bias of the forecasted track when the analysed vortex becomes deeper. This problem partly involves the adaptive inflation scheme used by NHMÁLETKF and will be tackled in our future research. In addition, the promising results obtained with NHMÁLETKF and POM in the Nargis case inspire us to apply the system for other cases such as the recent typhoon Haiyan in the Philippines.

Acknowledgements
This work was supported by the Ministry of Education, Culture, Sports, Science and Technology (MEXT) through Probabilistic maps of water levels above 1 m (a, c) and 3 m (b, d). The top panel shows forecasts based on the GSM downscaling, and the bottom panel forecasts based on the NHMÁLETKF using assimilation of TC advisories with P err 08 hPa. the Strategic Programs for Innovative Research (SPIRE). All results of the extended forecasts were obtained by using the K computer at the RIKEN Advanced Institute for Computational Science (proposal number hp120282, hp130012 and hp140220). We thank Nadao Kohno of JMA for providing the POM model modified for storm surge forecast and Masaru Kunii of MRI for pointing out an error in computing the inflation factor in the LETKF code. We also thank two anonymous reviewers, whose valuable comments significantly improved the quality of the manuscript.