Assimilating Ocean Observation Data for ENSO Monitoring and Forecasting

El Niño–Southern Oscillation (ENSO) is one of the most influential fluctuations of the coupled atmosphere-ocean climate system on the Seasonal-to-Interannual (SI) time scale for global and local communities. Ocean data assimilation systems are generally adopted for monitoring ENSO because the variation of the heat content in the ocean interior of the equatorial Pacific is considered a good precursor of El Niños and is essential for understanding the ENSO process. They are also utilized for the initialization of Coupled Atmosphere-Ocean General Circulation Models (CGCMs), along with atmosphere data assimilation systems, in the SI forecasting systems of various operational centers. This chapter discusses the capacity of current operational ocean data assimilation systems adopted for ENSO monitoring and SI forecasting, based on studies using the system of the Japan Meteorological Agency (JMA). It then demonstrates the benefits of assimilating ocean observation data through those systems in SI forecasts using the JMA seasonal and ENSO forecasting system. It also introduces the recent effort of JMA/Meteorological Research Institute (MRI) to resolve the "coupled shock", which is one of the most crucial issues concerning initialization with uncoupled ocean and atmosphere data assimilation systems in SI forecasting. This chapter is organized as follows. Section 2 summarizes the process of developing ocean data assimilation systems for up-to-date ENSO monitoring and SI forecasting. In particular, it describes efforts to improve the Temperature-Salinity (T-S) balance in the assimilation results over the past decade. Section 3 then introduces the ocean data assimilation system used in the JMA seasonal and ENSO forecasting system as a state-of-the-art system. Section 4 demonstrates the importance of assimilating ocean observation data through the ocean data assimilation system for ENSO and seasonal forecasting.
Section 5 introduces the recent effort to resolve the coupled shock at JMA/MRI. This chapter is summarized in Section 6.


First and second generations of ocean data assimilation systems
There have been two innovative efforts in the development of ocean data assimilation systems for ENSO monitoring and SI forecasting, and both followed material changes in the ocean observing system. The first effort followed the deployment of the Tropical Atmosphere Ocean (TAO) array (Hayes et al., 1991; McPhaden et al., 1998) under the Tropical Ocean and Global Atmosphere (TOGA) program (e.g., Ji et al., 1995). This change made operational monitoring of the ocean interior state in the equatorial Pacific with an ocean data assimilation system possible for the first time. However, since the major oceanic data observed by Autonomous Temperature Line Acquisition System (ATLAS) buoys (the mooring buoys that form the TAO array) were temperature profiles, ocean data assimilation systems developed in this early stage adopted Optimal Interpolation (OI) or the Three-Dimensional Variational Method (3DVAR) and assimilated only temperature profiles. Salinity profiles were discarded in those systems, although there were not many of them in any case. It should also be noted that these data assimilation systems directly modified only the model temperature fields; the model salinity field was modified only through adjustment by model physics. Development of schemes assimilating Sea Surface Height (SSH) data started after the launch of the TOPEX/Poseidon satellite, but for the first several years it still focused on modifying the model temperature field (e.g., Ji et al., 2000). Some studies, however, pointed out that appropriate treatment of salinity is essential for controlling model fields realistically, for the following two reasons. One is the effect of salinity variations on SSH and pressure fields through their contribution to density variations. This issue was addressed by Cooper (1988), Maes (1998), and Ji et al. (2000). In particular, Ji et al. 
(2000) demonstrated that using observed SSH in addition to in situ temperature data can increase analysis errors in systems that analyze temperature alone without considering the salinity anomaly, because a spurious temperature anomaly is estimated in order to compensate for the contribution of salinity to SSH. The other reason is the possibility of destroying density stratification, which was first addressed by Woodgate (1997). Temperature and salinity decrease with depth in many areas of the tropical and subtropical oceans. An upward shift of a water mass induces cold and low-salinity anomalies in those areas. When temperature alone is assimilated, the cold anomaly is reproduced but the low-salinity anomaly is missed, so the simulated density exceeds the actual density. This artificially dense water tends to destabilize the stratification and to induce spurious vertical mixing. This effect severely diminishes the salinity maxima in the subsurface layers of the tropical oceans (e.g., Troccoli et al., 2002). Many studies have also stressed the importance of salinity variability in the equatorial Pacific. The surface salinity front in the western equatorial Pacific is considered a good indicator of the eastern edge of the warm water pool, and its position is directly connected to the advective-reflective oscillator theory (e.g., Picaut et al., 1997). Roemmich et al. (1994) suggested that the salinity front affects the current fields in the near-surface layer. Formation of the barrier layer (the isothermal layer in which salinity is stratified; Lukas & Lindström, 1991) is another important feature associated with salinity there. This barrier layer can affect surface currents by thinning the mixed layer and concentrating the effect of wind stress (e.g., Vialard et al., 2002). It is also considered to induce a temperature rise in the mixed layer because it prevents warm water at the surface from mixing with cold water in the thermocline. 
This tendency is confirmed in the equatorial Pacific (e.g., Ando & McPhaden, 1997; Fujii et al., 2012; Maes et al., 2006). Some studies (e.g., Maes & Belamari, 2011) have further used CGCMs to confirm the impact of the barrier layer at the onset of El Niños.
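The stratification argument attributed to Woodgate (1997) can be illustrated with a small numerical sketch. It assumes a linearized equation of state with illustrative coefficients, and the temperature and salinity values are made up for the example; none of these numbers come from the studies cited above.

```python
# Illustrative constants for a linearized equation of state (assumed values).
RHO0 = 1025.0    # reference density, kg/m^3
ALPHA = 2.0e-4   # thermal expansion coefficient, 1/degC
BETA = 7.6e-4    # haline contraction coefficient, 1/psu


def density(temp, salt, t_ref=20.0, s_ref=35.0):
    """Density from a linearized equation of state (a simplification)."""
    return RHO0 * (1.0 - ALPHA * (temp - t_ref) + BETA * (salt - s_ref))


# Hypothetical (temperature, salinity) values at one model level.
background = (22.0, 35.2)            # model first guess
truth = (20.0, 34.9)                 # after an upward shift: colder and fresher
t_only = (truth[0], background[1])   # temperature corrected, salinity untouched

rho_truth = density(*truth)
rho_t_only = density(*t_only)
excess = rho_t_only - rho_truth      # extra density caused by the missed fresh anomaly

print(f"true density:     {rho_truth:.3f} kg/m^3")
print(f"T-only analysis:  {rho_t_only:.3f} kg/m^3")
print(f"spurious excess:  {excess:.3f} kg/m^3")
```

In this toy setting the temperature-only analysis is denser than the true water at the same level, which is the condition that favors spurious vertical mixing.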
In addition, the number of Argo floats rapidly increased after 2000, in order to meet the requirements of the Global Ocean Data Assimilation Experiment (GODAE) project (Clark et al., 2009). This second material change in the observing system prompted the development of schemes for assimilating observed salinity profiles effectively in ocean data assimilation systems. In 2000, the name of the TAO array was also changed to the TAO/Triangle Trans-Ocean Buoy Network (TRITON) array to designate the upgrading of ATLAS buoys west of 160°E to TRITON buoys, which carry enhanced equipment (Kuroda, 2002). However, many in situ observation platforms, including the TAO/TRITON array, still provide little salinity data. In order to use those observation data effectively along with the simultaneous observations of temperature and salinity by Argo floats and other sophisticated platforms, it is necessary to develop a method that guarantees consistency between temperature and salinity fields, even when only a small amount of salinity observation data exists. Thus, for the past decade, various schemes that estimate salinity fields mainly from temperature and SSH through the T-S balance relationship have been developed based on OI or 3DVAR (summarized in Fujii et al., 2010). For example, Ezer & Mellor (1994) and Kamachi et al. (2001) estimate salinity fields, as well as temperature fields, using regression coefficients of their anomalies against the SSH anomaly. Some studies (e.g., Carton et al., 2000; Huang et al., 2008; Yan et al., 2004) use climatological T-S relations typically calculated from the World Ocean Atlas (e.g., Locarnini et al., 2010). Vertical shifts of water masses in background (model) T-S profiles are applied in Haines et al. (2006) and Ricci et al. (2005). Coupled T-S Empirical Orthogonal Function (EOF) modal decomposition is also employed to reconstruct vertical temperature and salinity profiles in Dobricic et al. (2005), Fujii & Kamachi (2003), and Maes et al. (2000). 
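As a rough illustration of the simplest of these ideas, a climatological T-S relation, the sketch below estimates salinity from temperature by linear interpolation along a T-S curve. The curve values and the function name are hypothetical; a real system would derive such a relation from climatology for each location, and this is not the formulation of any of the studies cited above.

```python
from bisect import bisect_right

# Hypothetical climatological T-S curve for one location:
# (temperature in degC, salinity in psu), sorted by temperature.
TS_CURVE = [(10.0, 34.6), (15.0, 34.9), (20.0, 35.3), (25.0, 35.1), (29.0, 34.5)]


def salinity_from_temperature(temp, curve=TS_CURVE):
    """Estimate salinity by linear interpolation along a T-S curve.

    Temperatures outside the curve are clamped to its end points.
    """
    temps = [t for t, _ in curve]
    if temp <= temps[0]:
        return curve[0][1]
    if temp >= temps[-1]:
        return curve[-1][1]
    i = bisect_right(temps, temp)          # index of the first node above temp
    (t0, s0), (t1, s1) = curve[i - 1], curve[i]
    return s0 + (s1 - s0) * (temp - t0) / (t1 - t0)


# A temperature analysis at three levels and the salinity implied by the curve.
for t in (27.0, 17.5, 12.0):
    print(f"T = {t:5.1f} degC -> S = {salinity_from_temperature(t):.2f} psu")
```

A lookup of this kind keeps salinity consistent with a corrected temperature profile, at the cost of suppressing salinity variability that departs from climatology.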
Thus, most current operational ocean data assimilation systems adopt OI or 3DVAR and have the capacity to assimilate observed salinity profiles while imposing a multivariate (mainly T-S) balance relationship. These systems are called "second-generation" systems, while systems that can assimilate temperature data alone are called "first-generation" systems. Balmaseda et al. (2011) demonstrated an example of improved ENSO forecasting from the first generation to the second generation. In the following section, we also confirm the advantage of second-generation systems. It should be noted that data assimilation systems adopting more sophisticated schemes (e.g., the ensemble Kalman filter or the adjoint method) have also been developed or have started to be used in operations (e.g., Keppenne et al., 2005; Weaver et al., 2003). These systems have the potential to further improve monitoring accuracy and forecast skill.

10 m. A generalized enstrophy-preserving scheme and a scheme that involves the concept of diagonally upward/downward mass momentum fluxes along the sloping bottom (Ishizaki & Motoi, 1999) are applied for momentum advection. Isopycnal diffusion (Redi, 1982), isopycnal thickness diffusion (Gent & McWilliams, 1990), and the vertical mixing scheme of Noh & Kim (1999) are also adopted. The analysis scheme MOVE adopts the 3DVAR method using vertical coupled T-S EOF modal decomposition for the background error covariance matrix. In this scheme, the correction of temperature and salinity profiles to their first guess values, $\Delta x_p$, is represented by a linear combination of the T-S EOF modes as follows:

$$\Delta x_p = S \sum_l w_l \lambda_l u_l \tag{1}$$

where $w_l$ is the amplitude of the $l$th EOF mode, $\lambda_l$ is the singular value for the $l$th mode, $S$ is the diagonal matrix composed of the standard deviations of temperature and salinity from the first guess, and $u_l$ is the vector representing the $l$th mode. 
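The linear combination of T-S EOF modes described above can be sketched in a few lines. The sketch assumes that the correction takes the form of the standard deviations multiplied by the sum over modes of amplitude times singular value times mode vector, which is what the symbol definitions suggest; the mode, singular value, and standard deviations below are invented for illustration and are not MOVE-G values.

```python
def profile_correction(amplitudes, singular_values, modes, std_devs):
    """Correction vector S * sum_l (w_l * lambda_l * u_l).

    The state vector stacks temperature and salinity at all levels,
    so a single amplitude changes both variables together.
    """
    n = len(std_devs)
    combined = [0.0] * n
    for w, lam, u in zip(amplitudes, singular_values, modes):
        for k in range(n):
            combined[k] += w * lam * u[k]
    # Multiply by the diagonal matrix S (stored here as a plain list).
    return [s * c for s, c in zip(std_devs, combined)]


# Hypothetical two-level example: state = [T1, T2, S1, S2].
modes = [[0.5, 0.5, 0.5, 0.5]]        # one unit-length coupled T-S mode
singular_values = [2.0]
std_devs = [1.0, 0.5, 0.2, 0.1]       # degC, degC, psu, psu
amplitudes = [-1.0]                   # e.g. determined from temperature data only

delta = profile_correction(amplitudes, singular_values, modes, std_devs)
print("temperature correction:", delta[:2])
print("salinity correction:   ", delta[2:])
```

Because temperature and salinity share each mode, an amplitude constrained by temperature observations alone still produces a salinity correction, which is the property the second-generation systems rely on.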
It should be noted that the vector $\Delta x_p$ contains the corrections of temperature and salinity at all model levels as its elements. In MOVE-G, the model domain is partitioned into 40 horizontal subdomains, and the T-S EOF modes are calculated for each subdomain from historical profile data in the World Ocean Database 2001 (WOD01; Conkright et al., 2002) and the Global Temperature-Salinity Profile Program (GTSPP) database (Hamilton, 1994). The subdomains overlap in boundary areas. Correction of temperature and salinity profiles in the overlapping areas is calculated as a weighted sum,

$$x_p = x_p^f + \sum_m a_m \Delta x_{p,m} \tag{2}$$

where $x_p$ is the analysis of the profiles, $x_p^f$ is the first guess of the profiles, $m$ is the counter of the subdomains, $\Delta x_{p,m}$ is the correction based on the EOF modes of the $m$th subdomain, and $a_m$ is the weight of the $m$th subdomain, which satisfies $\sum_m a_m^2 = 1$ in order to avoid the loss or gain of total variance caused by the area partition (Fukumori, 2002). The mode amplitudes, $w_l$ in (1), are determined so as to minimize the cost function. The cost function, $J(w)$, is defined as

$$J(w) = \frac{1}{2} \sum_m \sum_l w_{m,l}^T B_m^{-1} w_{m,l} + \frac{1}{2} (Hx - y)^T R^{-1} (Hx - y) + \frac{1}{2} \left( H_h(x) - y_h \right)^T R_h^{-1} \left( H_h(x) - y_h \right) + J_{add} \tag{3}$$

where $w$ is the vector containing the amplitudes at all horizontal grid points for all subdomains, and $w_{m,l}$ is the partial vector of $w$ containing the amplitudes of the $l$th mode in the $m$th subdomain. The matrix $B_m$ represents the horizontal correlation of background (first guess) errors for the $m$th subdomain, modeled by a Gaussian function. The vector $y$ is composed of temperature and salinity observations, and $y_h$ is satellite SSH data. Vector $x = x^f + Gw$ is the state vector of temperature and salinity analysis fields composed of $x_p$, where $x^f$ is the first guess and $G$ denotes the transformation represented by (1) and (2). Matrix $H$ represents spatial interpolation for acquiring the values equivalent to the temperature and salinity observations, and $H_h$ is the nonlinear operator that includes calculating the Sea surface Dynamic Height (SDH) from gridded temperature and salinity data and interpolation. 
Matrix $R$ ($R_h$) is the error covariance matrix for the temperature and salinity profiles (satellite SSH data). The term $J_{add}$ is the additional nonlinear constraint for avoiding density inversion. The gradient of the cost function, $g = \partial J / \partial w$, is written as

$$g_{m,l} = B_m^{-1} w_{m,l} + \left[ G^T \left( H^T R^{-1} (Hx - y) + H_h^{*} R_h^{-1} \left( H_h(x) - y_h \right) \right) \right]_{m,l} + \frac{\partial J_{add}}{\partial w_{m,l}} \tag{4}$$

where $g_{m,l}$ is the part of $g$ corresponding to $w_{m,l}$ and $H_h^{*}$ is the adjoint code of the nonlinear operator $H_h$. We also apply the variational quality control procedure introduced in Fujii et al. (2005). The cost function is non-quadratic in $w$, and the calculation of $g$ includes inversion of the non-diagonal matrix $B_m$. In MOVE-G, we adopt the preconditioned quasi-Newton method introduced in Fujii (2005) to minimize this non-quadratic function without directly implementing the inversion. The 3DVAR analysis explained above is performed once every assimilation cycle using all available observations within the period of the cycle, and the result is reflected in the model fields by Incremental Analysis Updates (IAU; Bloom et al., 1996). The first guess for the analysis is given as a weighted mean of the climatology and the model prediction for the middle of the cycle, started from the assimilation result at the end of the previous cycle. The difference between the analysis and the model prediction (i.e., the analysis increment) is applied to correct the temperature and salinity fields in the model. Current fields are adjusted to the corrected temperature and salinity fields through the model dynamics, and thus establish the geostrophic balance in most areas. An online model-bias estimation using the one-step bias-correction algorithm (Balmaseda et al., 2007) can be applied with IAU in MOVE-G. The bias estimates are subtracted from the model-prediction fields before calculating the first guess, and are updated in every assimilation cycle by taking a weighted mean of their previous values and the analysis increment. 
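The first-guess construction and the IAU step described above can be sketched for a single scalar state. This is a toy illustration under stated assumptions: the climatology weight, the number of steps, and the persistence "model" are invented, and the increment is simply added in equal parts at every step, which is one common way of applying IAU rather than a description of the MOVE-G code.

```python
def first_guess(model_prediction, climatology, clim_weight):
    """Weighted mean of climatology and model prediction.

    clim_weight is a hypothetical tuning parameter between 0 and 1.
    """
    return clim_weight * climatology + (1.0 - clim_weight) * model_prediction


def run_cycle_with_iau(state, increment, n_steps, model_step):
    """Integrate one assimilation cycle, adding the analysis increment
    gradually (an equal share per step) instead of all at once."""
    per_step = increment / n_steps
    for _ in range(n_steps):
        state = model_step(state) + per_step
    return state


# Toy example with a persistence model (state does not change on its own).
prediction = 26.0                     # model prediction, degC
analysis = 27.0                       # pretend result of the 3DVAR analysis
increment = analysis - prediction     # analysis increment

guess = first_guess(prediction, climatology=24.0, clim_weight=0.25)
final_state = run_cycle_with_iau(prediction, increment, n_steps=10,
                                 model_step=lambda x: x)
print("first guess:", guess)
print("state at end of cycle:", final_state)
```

With a persistence model the state simply arrives at the analysis value by the end of the cycle; with real dynamics the gradual forcing gives the currents time to adjust to the corrected temperature and salinity fields.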
In addition, the SSH change due to the variation of the total fresh water mass in the global ocean, which is not taken into account in the OGCM, is estimated in the assimilation runs in Section 4. In the estimation, this globally constant value is regarded as a control variable of 3DVAR. This value is added to SDH calculated from the temporal analysis fields before subtracting the observed SSH. The term of the background constraint for the SSH change is also added to the cost function (3). The estimation in the analysis in the previous assimilation cycle is adopted as the first guess. We set a small value for the prescribed error variance of the first guess in order to vary the value slowly. In situ temperature and salinity profiles, satellite SSH anomaly data, and observation-based gridded Sea Surface Temperature (SST) data are assimilated into the model in MOVE-G. The temperature and salinity profiles employed in the assimilation runs in this study are collected from WOD01, GTSPP, and the data of the TAO/TRITON array. Profiles of Argo floats are included in GTSPP. The SSH data is the along-track data from TOPEX/Poseidon, Jason-1, ERS-1/2, and ENVISAT, extracted from Ssalto/Duacs delayed-time multimission altimeter products (CLS, 2004). Centennial in-situ Observation-Based Estimates of the variability of SST and marine meteorological variables (COBE-SST; Ishii et al., 2005), or the gridded SST data compiled in JMA, are also assimilated in the assimilation runs in Sections 4 and 5.
In the previous section, we indicated that state-of-the-art (second-generation) operational ocean data assimilation systems adopt schemes in which consistency between temperature and salinity is assumed. MOVE-G also improves the accuracy of the salinity fields by establishing an adequate relationship between temperature and salinity through the coupled T-S EOF modes. In particular, variation of salinity coupled with that of temperature can be estimated through the coupled T-S EOF modes, even if little salinity data is available (Fujii & Kamachi, 2003). Here, we compare the temperature and salinity fields in three assimilation runs in order to demonstrate the difference between the first- and second-generation ocean data assimilation systems. One is the assimilation run named MOVE-G VAL in Fujii et al. (2012). The atmospheric reanalysis dataset produced by the National Centers for Environmental Prediction and the National Center for Atmospheric Research (NCEP-R1; Kalnay et al., 1996) is employed as the external forcing in MOVE-G VAL. Online model-bias estimation is applied in this assimilation run. The length of the assimilation cycle is set to one month. The gridded SST data are not assimilated. In addition, several profiles of temperature and salinity are excluded from the assimilated data in order to use them as independent reference data. The excluded profiles are those observed by TRITON buoys positioned at (5°N, 156°E), (0°, 156°E), and (5°S, 156°E); the profiles of Argo floats whose World Meteorological Organization (WMO) ID ends in the digit "4"; and the profiles whose position and date are close to those of one of the above excluded profiles (within 0.1° in longitude and latitude, and on the same date). In another assimilation run, MOVE-G 1GE, only temperature observations are assimilated, and salinity increments estimated from the temperature data through T-S EOFs are not applied to correct the model salinity field. 
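The rule for withholding independent reference profiles can be written down as a short filter. The sketch below is only an illustration: the `Profile` record, the platform labels, and the sample WMO IDs are hypothetical, buoy positions are assumed to be stored as nominal positions, and the 0.1° limit is treated as inclusive.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Profile:
    platform: str   # hypothetical labels: "TRITON", "ARGO", "SHIP", ...
    wmo_id: str
    lat: float      # degrees north
    lon: float      # degrees east
    day: date


# Nominal (lat, lon) of the three withheld TRITON buoys along 156E.
WITHHELD_TRITON_SITES = {(5.0, 156.0), (0.0, 156.0), (-5.0, 156.0)}


def is_primary_withheld(p):
    """TRITON buoys at the three sites, or Argo floats with WMO ID ending in 4."""
    if p.platform == "TRITON":
        return (p.lat, p.lon) in WITHHELD_TRITON_SITES
    if p.platform == "ARGO":
        return p.wmo_id.endswith("4")
    return False


def split_profiles(profiles, tol=0.1):
    """Return (assimilated, withheld); profiles close to a withheld one
    on the same date are withheld as well."""
    primary = [p for p in profiles if is_primary_withheld(p)]

    def is_withheld(p):
        return is_primary_withheld(p) or any(
            p.day == q.day
            and abs(p.lat - q.lat) <= tol
            and abs(p.lon - q.lon) <= tol
            for q in primary
        )

    withheld = [p for p in profiles if is_withheld(p)]
    assimilated = [p for p in profiles if not is_withheld(p)]
    return assimilated, withheld


profiles = [
    Profile("TRITON", "52001", 0.0, 156.0, date(2005, 1, 1)),
    Profile("ARGO", "5900004", 1.2, 170.0, date(2005, 1, 1)),
    Profile("ARGO", "5900007", 1.2, 175.0, date(2005, 1, 1)),
    Profile("SHIP", "", 1.25, 170.05, date(2005, 1, 1)),   # near a withheld float
    Profile("SHIP", "", 1.25, 170.05, date(2005, 1, 2)),   # same place, next day
]
assimilated, withheld = split_profiles(profiles)
print(len(assimilated), "assimilated,", len(withheld), "withheld")
```

Withholding neighbors of the reference profiles keeps near-duplicate observations from leaking the reference information into the analysis.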
Thus, in MOVE-G 1GE the model salinity field is adjusted to the corrected temperature field only through the model physics. This run is equivalent to those of the first-generation systems, while MOVE-G VAL can be considered an assimilation run of the second-generation system. In the remaining run, MOVE-G 2GE-T, observation data other than temperature are not assimilated either; however, the salinity increments calculated through T-S EOFs are applied to correct the model salinity fields. The settings of the assimilation system for MOVE-G 1GE and MOVE-G 2GE-T, other than those described above, are the same as those for MOVE-G VAL. The observation data withheld in MOVE-G VAL are also withheld in MOVE-G 1GE and MOVE-G 2GE-T. We analyze the results of these assimilation runs for the period 1993-2008. Validation of salinity against the data observed by the TRITON buoys deployed along 156°E (Fig. 1) clearly indicates that using T-S EOFs improves the salinity field even if salinity observations are not assimilated. The subsurface salinity maximum is clearly diminished at 0° and 5°S in MOVE-G 1GE. The variations of near-surface fresh water at these positions are not estimated satisfactorily in this run. The subsurface salinity maximum is smoothed out vertically at 5°N. These errors stem from the spurious vertical mixing induced by the breakdown of the T-S relation due to modifying the temperature field alone (Troccoli et al., 2002). The subsurface low-salinity biases at 0° and 5°S are removed, and the variations of the surface fresh water are reasonably well estimated in MOVE-G 2GE-T owing to the T-S EOFs. The subsurface salinity at 5°N is also improved, although the contrast of the subsurface salinity between 0° and 5°N is slightly underestimated. Assimilating salinity and SSH improves the salinity field further. 
The subsurface salinity contrast between 0° and 5°N is improved, and the appearance of water whose salinity exceeds 35.4 at 0° is properly estimated in MOVE-G VAL. It should be noted that the observation data of these TRITON buoys are not assimilated in MOVE-G VAL either, as in the other two assimilation runs. Figure 2(a) indicates that MOVE-G 1GE has a colder temperature field than MOVE-G VAL in the entire equatorial Pacific. This colder temperature is also induced by spurious vertical mixing due to modification of only the model temperature field. We further calculate statistics for the accuracy of temperature fields in the equatorial Pacific for MOVE-G 1GE and MOVE-G VAL. The reference for the statistics is the data of the profiling floats in 2°S-2°N, 130°E-80°W that are withheld in the assimilation runs. The result (Table 1) reveals that spurious mixing actually causes a cold bias. Table 1 also indicates that imposing a T-S balance relationship in the second-generation system resolves this problem. The difference between the mean temperature field of MOVE-G 2GE-T and that of MOVE-G VAL (Fig. 2(b)) also implies that imposing the T-S relationship effectively suppresses spurious mixing and reduces the cold bias. Smaller Root Mean Square Differences (RMSDs) and larger Anomaly Correlation Coefficients (ACCs) suggest that not only the mean state but also the variability of the temperature field in the equatorial Pacific is improved in MOVE-G VAL. The accuracy of temperature and salinity fields in the equatorial Pacific is thus improved in second-generation systems, as a result of imposing the T-S balance relationship.

Bias (Locarnini et al., 2010). Values equivalent to the reference for the assimilation runs are calculated by spatial and temporal interpolations from the monthly mean data.
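The three statistics used in this validation (bias, RMSD, and ACC) can be computed as in the following sketch. The ACC here is taken as the centered correlation of anomalies from a common climatology, which is an assumption about the exact definition used, and the sample temperatures are invented.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def bias(analysis, reference):
    """Mean difference, analysis minus reference."""
    return mean([a - r for a, r in zip(analysis, reference)])


def rmsd(analysis, reference):
    """Root mean square difference between analysis and reference."""
    return sqrt(mean([(a - r) ** 2 for a, r in zip(analysis, reference)]))


def acc(analysis, reference, climatology):
    """Centered correlation of anomalies relative to the climatology."""
    fa = [a - c for a, c in zip(analysis, climatology)]
    fr = [r - c for r, c in zip(reference, climatology)]
    ma, mr = mean(fa), mean(fr)
    cov = sum((x - ma) * (y - mr) for x, y in zip(fa, fr))
    var_a = sum((x - ma) ** 2 for x in fa)
    var_r = sum((y - mr) ** 2 for y in fr)
    return cov / sqrt(var_a * var_r)


# Hypothetical withheld float temperatures and a run with a uniform cold bias.
reference = [28.0, 27.5, 29.0, 26.5]
climatology = [28.0, 28.0, 28.0, 28.0]
cold_run = [r - 0.5 for r in reference]

print("bias:", bias(cold_run, reference))
print("RMSD:", rmsd(cold_run, reference))
print("ACC: ", acc(cold_run, reference, climatology))
```

A uniform cold bias shows up in the bias and RMSD but leaves the ACC untouched, which is why the text considers both the mean state and the variability.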

Impact of ocean observation data on ENSO forecasting
In this section, we demonstrate the benefits of assimilating ocean observation data through the ocean data assimilation system in ENSO and seasonal forecasting. The impact of oceanic temperature and salinity observed by the TAO/TRITON array and profiling floats (most deployed as Argo floats) on the SI forecasting is examined in order to show the benefits. Temperature profiles of the TAO array were clearly the most influential data on ENSO forecasting before 2000. However, the recent increase of Argo floats is likely to reduce the influence of the TAO/TRITON array, since both observe the subsurface temperature field in the tropical Pacific.
Possibly, oceanic temperature and salinity information from the TAO/TRITON array and profiling floats is mostly redundant. If so, part of the cost of the observing platforms is wasted. Therefore, showing the complementary effects of the array and the floats is vital for sustaining these observation platforms. In particular, the impact of the observing platforms on SI forecasting is important information for administrators of the observing system, because SI forecasting is one of the most influential products of ocean observation data.
The complementary impacts of the TAO/TRITON array and profiling floats have already been demonstrated by an Observing System Experiment (OSE) using the seasonal forecasting system of the European Centre for Medium-Range Weather Forecasts (ECMWF) (Balmaseda & Anderson, 2009). The results of OSEs, however, depend greatly on the forecasting system. Therefore, we also perform OSEs continuously in order to examine the impact of assimilating oceanic temperature and salinity data of the TAO/TRITON array and profiling floats through MOVE-G in the JMA seasonal and ENSO forecasting system. It should be noted that this study does not evaluate the atmospheric data and oceanic current data of the TAO/TRITON array. Our previous results have been briefly introduced elsewhere; in this section, we introduce our recent OSE results. The OSE configuration is as follows. First, we perform three ocean data assimilation runs (ALL, XTT, and XAF) from 2000 to 2008 using MOVE-G. All available data in the regular observation dataset are assimilated in ALL. Data of the TAO/TRITON array are withheld in XTT, and data of the profiling floats are withheld in XAF. The three assimilation runs use the same settings except for the assimilated data. The period of the assimilation cycle is 10 days, the same as in JMA's operational system. The Japanese 25-year Reanalysis (JRA-25; Onogi et al., 2007) is used for estimating the atmospheric forcing. The original period of JRA-25 is 1979 to 2004, but it is extended to the present time by adding the product of the JMA Climate Data Assimilation System (JCDAS). The online model-bias estimation is not applied here. However, we apply the estimation of the SSH change due to the variation of the total fresh water mass in the global ocean. COBE-SST is assimilated in all the assimilation runs. 
Thirteen-month 11-member ensemble forecasts are then performed with JMA/MRI-CGCM, the CGCM adopted in the forecasting system (Takaya et al., 2010; Yasuda et al., 2007), from 31 January, 26 April, 30 July, and 28 October in each of the 5 years 2004-2008. We thus have 20 forecasts. Analysis fields of ALL, XTT, and XAF are used as the initial conditions of the ocean component in these forecasts. The atmospheric fields of JRA-25 are used for the initial condition of the atmospheric component. The flux correction procedure used in the JMA operation is applied in all forecast calculations. In order to generate ensemble members, we add small perturbations to the observation data of the gridded SST in the 10-day assimilation cycle just before the start of a forecast. Although we applied perturbations in the same shape as the regression map of SST to the NINO4 index (see Table 2), we found in a preliminary study that forecast results are not sensitive to the shape of the perturbation if its scale is as small as that applied in this study. We calculate the ensemble mean of horizontal monthly mean fields for several atmospheric and oceanic parameters from all the forecast results and convert them to data on a 2.5°-interval grid, although only the results for SST, Sea Level Pressure (SLP), Velocity Potential on the 200 hPa surface (VP200), and Outgoing Longwave Radiation (OLR) are presented here. Forecast biases are estimated by averaging the deviations of forecast values from their references separately for each lead time; each forecast month; and each of ALL, XTT, and XAF. The forecast values are then calibrated by subtracting the corresponding biases. Thus, each set of forecasts for ALL, XTT, and XAF is separately calibrated. It should be noted that each forecast value is itself included in the bias estimate used to calibrate it. Although this procedure is not fair for evaluating forecast skill, we adopt it because of the shortage of forecasts. 
We assume this procedure does not affect the conclusion of this study. We use COBE-SST as the reference for SST, the National Oceanic and Atmospheric Administration (NOAA) OLR (Liebmann & Smith, 1996) as the reference for OLR, and JRA-25 as the reference for the other atmospheric parameters (SLP and VP200).

Table 2. Definition of the areas in which the SST anomaly is averaged for calculating SST indices.
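The calibration procedure described above can be sketched as a grouped mean-bias removal. The record layout and the sample values are hypothetical, and "forecast month" is interpreted here as the start month of the forecast, which is an assumption. As in the text, each forecast contributes to the bias that is subtracted from it.

```python
from collections import defaultdict


def calibrate(forecasts):
    """Subtract the mean deviation from the reference, estimated separately
    for each (experiment, start month, lead time) group."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for f in forecasts:
        key = (f["experiment"], f["start_month"], f["lead"])
        totals[key] += f["value"] - f["reference"]
        counts[key] += 1
    biases = {key: totals[key] / counts[key] for key in totals}

    calibrated = []
    for f in forecasts:
        key = (f["experiment"], f["start_month"], f["lead"])
        calibrated.append({**f, "calibrated": f["value"] - biases[key]})
    return calibrated, biases


# Hypothetical ensemble-mean SST values (degC) at a 3-month lead.
forecasts = [
    {"experiment": "ALL", "start_month": 1, "lead": 3, "value": 28.0, "reference": 27.0},
    {"experiment": "ALL", "start_month": 1, "lead": 3, "value": 30.0, "reference": 28.0},
    {"experiment": "XTT", "start_month": 1, "lead": 3, "value": 29.0, "reference": 27.0},
]
calibrated, biases = calibrate(forecasts)
for c in calibrated:
    print(c["experiment"], c["value"], "->", c["calibrated"])
```

With more forecasts available, a leave-one-out estimate of the bias would avoid using a forecast to calibrate itself.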
First, we analyze the impact of the TAO/TRITON array and profiling floats on the forecasts of the area-averaged SST indices. The indices are calculated by averaging the anomaly of monthly mean SST from the monthly climatology of the reference data in the areas defined in Table 2. The impact of the TAO/TRITON array (profiling floats) is evaluated through the difference in ACCs with the reference between the forecasts from ALL and XTT (XAF) (i.e., the increase of ACC if the data of the array (floats) are assimilated in addition to the other data in the regular observation dataset).

Figure 3 also indicates that forecasts of SST with lead times of 8-13 months (8-13M LT) are greatly improved by assimilating the float data in the eastern and central equatorial Pacific, except NINO1+2. The increase of ACC is more than 0.1 for NINO3.4 and NINO4, and about 0.09 for NINO3, although its impact is as large as that of the TAO/TRITON array for 1-7M LT forecasts in these areas. The longer lead time ENSO forecasts are likely to be effectively improved by the better subsurface temperature fields in the whole tropical Pacific, due to the assimilation of the float data.
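The impact measure used here, the ACC gained when a platform is added to the assimilated data, can be sketched as follows. The grid, the averaging box, and the index values are invented for illustration (the actual areas are those of Table 2), the area mean is unweighted, and the ACC is taken as a centered correlation, all of which are assumptions.

```python
from math import sqrt


def area_mean(field, lats, lons, box):
    """Unweighted mean of field[j][i] over grid points inside
    box = (lat_min, lat_max, lon_min, lon_max), limits inclusive."""
    lat_min, lat_max, lon_min, lon_max = box
    values = [
        field[j][i]
        for j, lat in enumerate(lats)
        for i, lon in enumerate(lons)
        if lat_min <= lat <= lat_max and lon_min <= lon <= lon_max
    ]
    return sum(values) / len(values)


def correlation(x, y):
    """Centered correlation between two index time series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def impact(reference_index, index_all, index_withheld):
    """ACC of the ALL forecasts minus ACC of the data-withholding forecasts."""
    return correlation(index_all, reference_index) - correlation(
        index_withheld, reference_index
    )


# Hypothetical SST anomaly field on a tiny 2 x 2 grid and a hypothetical box.
anomaly = [[1.0, 2.0], [3.0, 4.0]]
index_value = area_mean(anomaly, lats=[-2.5, 2.5], lons=[190.0, 192.5],
                        box=(-5.0, 5.0, 190.0, 190.0))

# Hypothetical index series: reference, forecasts from ALL, forecasts from XTT.
reference_index = [1.0, -1.0, 2.0, -2.0]
index_all = [1.0, -1.0, 2.0, -2.0]
index_xtt = [1.0, -1.0, -2.0, 2.0]
gain = impact(reference_index, index_all, index_xtt)
print("area-mean anomaly:", index_value)
print("ACC gain from adding the withheld platform:", gain)
```

A positive gain means that the forecasts initialized with the platform's data track the reference index more closely than those initialized without it.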
Assimilating float data has a negative impact on NINO1+2. The impact is also negative in the western tropical Pacific (the TRITON and Philippine area) for 1-7M LT forecasts, although it becomes positive for 8-13M LT forecasts there. Assimilating float data also increases the ACC of SST in the eastern Indian Ocean (IODE) by more than 0.1 and improves the ACC for the western Indian Ocean (IODW) by about 0.03 for 1-7M LT forecasts. Although the large impact on IODE remains for 8-13M LT forecasts, the skill of forecasting the SST index for IODW is severely degraded for 8-13M LT forecasts. Thus, the oceanic data of both the TAO/TRITON array and profiling floats generally have positive impacts on the SST forecasts in the equatorial Pacific. Here, the positive impacts of these oceanic data can be regarded as complementary because they mean that the forecast skills are improved by assimilating TAO/TRITON (float) data in addition to the float (array) and other regular data. They also suggest that adding temperature and salinity profiles to the current regular observation data (including TAO/TRITON and float data) may further improve the forecast skills. These positive impacts also affect the forecast skills of the atmospheric state. Figure 4 indicates the impact of assimilating data of the TAO/TRITON array or the profiling floats on SLP, VP200, and OLR for 1-7M LT forecasts. Forecasts of SLP fields are remarkably improved by assimilating TAO/TRITON array data in the central and eastern tropical Pacific, with a direct link to the forecast improvement of the SST anomaly that indicates El Niños and La Niñas. It also improves SLP in the Indian Ocean and south of Japan, probably due to the remote effects of ENSO. VP200 is also improved by TAO/TRITON data in a wide area, particularly over North America and the area extending from northern China to the Philippine Sea. 
Figures 4(d, e) indicate that assimilating float data also has a positive impact, very similar to that of assimilating TAO/TRITON data, on the forecasts of SLP and VP200. The increase of ACC for VP200 is caused by better representation of the vertical air mass transports associated with precipitation in the tropics. Figures 4(c, f) indicate that assimilating the data of the TAO/TRITON array or profiling floats improves the variation of OLR, which is considered a proxy of precipitation, in a wide area over the Pacific including the western subtropical North Pacific south of Japan, around the maritime continent and Australia, and in the tropical and subtropical South Indian Ocean. Improvement of the divergence fields in the upper troposphere (VP200) leads to better upper-tropospheric wind fields and adequate representation of the global-scale atmospheric circulation. Thus, increasing the amount of ocean data for assimilation has the potential to improve the forecast skills of the atmospheric state globally. It should, however, be noted again that the impact of observation data in an OSE is highly dependent on the forecasting system (i.e., the forecasting model and the data assimilation scheme). In addition, it depends on the target phenomena of the OSE. Therefore, we should examine the impacts of ocean observation data on a variety of targets using various forecasting systems in order to evaluate the value of the data appropriately and to optimize the ocean observing system. For that purpose, delayed-mode OSEs of ocean observation data for weekly to decadal time scales, including SI forecasts, are encouraged by the observing system evaluation task team of the GODAE Ocean View project (see also https://www.godae-oceanview.org/science/task-teams/observing-system-evaluation-tt-oseval-tt/). 
The task team plans to compile the results of OSEs and release them as the observation impact statement in order to share the values of ocean observations with the public and the administrators of observation platforms.

Constraining a coupled model using ocean data assimilation: an effort to resolve the coupled shock
In the previous section, we demonstrated the benefits of assimilating ocean observation data through ocean data assimilation systems in ENSO and seasonal forecasting. However, it should be noted that ocean analysis/reanalysis fields calculated by ocean assimilation systems generally have some inconsistency with atmospheric analysis/reanalysis fields because they are not calculated simultaneously in a unified data assimilation system. This inconsistency induces the so-called "coupled shock" when the ocean and atmospheric fields are used as the initial condition of a CGCM in an ENSO and seasonal forecasting system, and possibly reduces the improvement of the forecast skill due to the assimilation of the ocean observation data.
An essential method of resolving the coupled shock is developing a coupled data assimilation system, that is, a system in which atmosphere and ocean observation data are employed to constrain a CGCM. Information in the ocean observation data is extracted more effectively in a coupled data assimilation system, because it can improve not only oceanic but also atmospheric analysis/reanalysis fields. In view of these benefits, coupled data assimilation systems were developed by Zhang et al. (2007), based on an ensemble Kalman filter, and by Sugiura et al. (2008), based on an adjoint method. However, developing a coupled data assimilation system requires tremendous human effort and a heavy computational burden. The difference in the dominant time scales of the atmosphere and ocean is another challenge. In order to deal with the coupled shock, we developed a "quasi-coupled data assimilation system" that uses only ocean observation data to constrain the ocean component of a CGCM (atmospheric observation data are not assimilated). This system is named MOVE-C. MOVE-C adopts the coupled model for JMA's SI forecasting, JMA/MRI-CGCM, and we apply the same ocean data assimilation procedure as in MOVE-G, since the oceanic component of the CGCM is the same as the OGCM adopted in MOVE-G. We assume that slow components (i.e., climate variability) can be extracted from the full variability of the coupled atmosphere and ocean system by assimilating only ocean observation data in MOVE-C. It is a prototype of a truly coupled data assimilation system that we intend to develop in the future. In order to examine the feasibility of the quasi-coupled data assimilation system, we conduct a five-member ensemble assimilation run for the period 1979-2008 using MOVE-C. Here, we apply the online model-bias estimation, the length of the assimilation cycle is set to one month, and COBE-SST is assimilated.
It should be noted that a single-member assimilation run was used in the analysis in Fujii et al. (2009). We assume that using a five-member ensemble increases the reliability of the analysis compared to the previous study. We also conduct a five-member ensemble of Atmospheric Model Intercomparison Project (AMIP) runs, that is, simulation runs of the atmosphere model used in MOVE-C with the observed daily SST data (COBE-SST) as the oceanic boundary condition, for the same period. In addition, we use a free simulation run of JMA/MRI-CGCM, the CGCM used in MOVE-C. The simulation started from an assimilation result of MOVE-G and JRA-25 on 1 January 2000, and the integration is performed for 101 years. We use the simulation result of the last 60 years here. We also use COBE-SST, NOAA OLR, and JRA-25 as the reference for SST, OLR, and the other atmospheric parameters, respectively. All datasets are converted to monthly means with a grid spacing of 2.5° before calculating statistics and indices in this section.
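As an illustration of how an index can be computed from fields on such a grid, the following minimal Python sketch takes a cos(latitude)-weighted mean over a latitude-longitude box. The function name box_mean, the grid-centre layout, and the NINO3 box (5°S-5°N, 150°W-90°W, the conventional definition, which is not stated in this chapter) are assumptions made for illustration; they are not taken from the JMA/MRI systems.

```python
import math

def box_mean(field, lats, lons, lat_range, lon_range):
    """Return the cos(latitude)-weighted mean of field[j][i] over a box.

    lats and lons are grid-cell centres in degrees (longitude in degrees east);
    lat_range and lon_range are inclusive (min, max) bounds.
    """
    total = 0.0
    weight = 0.0
    for j, lat in enumerate(lats):
        if not lat_range[0] <= lat <= lat_range[1]:
            continue
        w = math.cos(math.radians(lat))  # relative area of a regular lat-lon cell
        for i, lon in enumerate(lons):
            if lon_range[0] <= lon <= lon_range[1]:
                total += w * field[j][i]
                weight += w
    if weight == 0.0:
        raise ValueError("no grid cells fall inside the requested box")
    return total / weight

# Hypothetical 2.5-degree global grid (cell centres); the real layout may differ.
LATS = [-88.75 + 2.5 * j for j in range(72)]
LONS = [1.25 + 2.5 * i for i in range(144)]

# Conventional NINO3 box, 5S-5N and 150W-90W (210E-270E); assumed, not from this chapter.
NINO3_LAT = (-5.0, 5.0)
NINO3_LON = (210.0, 270.0)

# Toy SST anomaly field: 1.0 K inside the NINO3 box, 0.0 K elsewhere.
sst_anom = [[1.0 if (NINO3_LAT[0] <= lat <= NINO3_LAT[1]
                     and NINO3_LON[0] <= lon <= NINO3_LON[1]) else 0.0
             for lon in LONS] for lat in LATS]

nino3 = box_mean(sst_anom, LATS, LONS, NINO3_LAT, NINO3_LON)
```

Applying the same function to each monthly anomaly field would give a monthly index time series. The cosine weighting matters little for a narrow equatorial box but becomes important for indices defined over wider latitude ranges.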
In the previous study, we demonstrated that the precipitation field was improved in the tropics in a single-member assimilation run of MOVE-C over an AMIP run. First, we reconfirm the improvement using the five-member ensemble run. In order to compare the accuracy of the MOVE-C run with that of the AMIP run, we calculate ACCs between OLR (a proxy of precipitation) in those runs and the reference data. The three-month running mean of the deviation from the monthly climatology of each run or reference is adopted as the anomaly. Figure 5 indicates that ACC for OLR is improved in the western equatorial Pacific and Philippine Sea, around the maritime continent, and in a wide area of the Indian Ocean. ACC is also increased around the Himalayas. This result is consistent with that of the previous study (Fujii et al., 2009). In the previous study, we suggested that the precipitation field deteriorates in the absence of negative feedback between SST and precipitation in the AMIP run. In the real world, warm (cold) SST tends to increase (decrease) precipitation, while enhanced (suppressed) convection tends to induce an SST drop (rise) because of the cloud cover and the condition of ocean mixing. This negative feedback does not work appropriately in the AMIP run because SST is prescribed. Our analysis suggests that this defect of the AMIP run is likely to suppress the atmospheric response to ENSO. Figure 6 presents the distribution of simultaneous correlation coefficients of the anomaly of VP200 with the NINO3 index. It should be noted that the precipitation field is directly coupled with the vertical air mass transport and therefore with the divergence in the upper troposphere (a negative value of VP200).
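The anomaly definition and ACC calculation described above (three-month running mean of the deviation from each dataset's own monthly climatology, then a correlation against the reference) can be sketched for a single grid point as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the series is assumed to start in January, the running mean is centred, and the ACC is taken here as a Pearson correlation of the anomalies. The chapter does not specify these details.

```python
import math

def monthly_anomaly(series):
    """Deviation from the monthly climatology; series[0] is assumed to be a January value."""
    clim = []
    for month in range(12):
        values = series[month::12]
        clim.append(sum(values) / len(values))
    return [x - clim[i % 12] for i, x in enumerate(series)]

def running_mean3(series):
    """Centred three-month running mean; the first and last months are dropped."""
    return [(series[i - 1] + series[i] + series[i + 1]) / 3.0
            for i in range(1, len(series) - 1)]

def correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def acc(run, reference):
    """Anomaly correlation of a model run with the reference at one grid point."""
    a = running_mean3(monthly_anomaly(run))
    b = running_mean3(monthly_anomaly(reference))
    return correlation(a, b)

# Toy data: ten years of monthly values sharing one interannual signal, but with
# different seasonal cycles and a constant bias in the "run".
signal = [math.sin(0.37 * i) for i in range(120)]
reference = [10.0 * math.cos(2.0 * math.pi * i / 12.0) + s for i, s in enumerate(signal)]
run = [2.0 + 5.0 * math.cos(2.0 * math.pi * i / 12.0) + s for i, s in enumerate(signal)]

# The bias and the seasonal cycles are removed with the climatologies, so the
# anomalies coincide and the ACC equals 1 up to rounding.
score = acc(run, reference)
```

Because each dataset is referred to its own climatology, a constant model bias or a misrepresented seasonal cycle does not lower the score; only errors in the interannual variability do.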
In an El Niño period (when the NINO3 index is positive), increased precipitation in the central equatorial Pacific induces a divergent anomaly in the upper troposphere (negative anomaly of VP200) there, while decreased precipitation in the western equatorial Pacific and maritime continent induces a convergent anomaly (positive anomaly of VP200) there. Thus, the correlation coefficients are positive around the maritime continent and negative over the central equatorial Pacific. Comparison of the correlation coefficients of the AMIP run (Fig. 6(b)) with those of the reference data (Fig. 6(a)) indicates that both negative and positive correlations are underestimated. This weak response of the atmosphere to ENSO probably occurs for the following reason. Precipitation (a drought) tends to decrease (increase) SST in the real world. However, precipitation is likely to be underestimated (overestimated) if the decreased (increased) SST is given as the oceanic boundary condition in the AMIP run. Thus, the response of the atmosphere to SST through precipitation is suppressed in the AMIP run. This problem is resolved by taking into account the air-sea interaction through a CGCM. If the decreased (increased) SST suppresses (enhances) precipitation, the model increases (decreases) SST by the negative feedback, and augments (reduces) precipitation. The precipitation is thus kept at the appropriate level. Figure 6(c) confirms that the negative correlation of VP200 with the NINO3 index over the central Pacific is adequately reproduced in the MOVE-C run, and that the positive correlation over the maritime continent is improved over the AMIP run, although it is slightly overestimated, reflecting an inherent property of the CGCM that is implied by the map for the CGCM free run (Fig. 6(d)). The overestimated response to ENSO in the free run may be caused by the absence of part of the natural variability due to the lack of some model physics.
Fig. 6 (caption, partial). The anomaly is the deviation from the monthly climatology, and the three-month running mean is applied for both VP200 and the index. The correlation coefficients averaged for ensemble members are adopted for the MOVE-C and AMIP runs.
The improved response of the atmosphere to ENSO is likely to improve the variability of VP200 (Fig. 7). In the MOVE-C run, the area where ACC exceeds 0.8 spreads wider than in the AMIP run over the maritime continent. ACCs also apparently increase over the western Indian Ocean, East Asia, western North Pacific, and east of the Hawaiian Islands. In the MOVE-C run, the accuracy of SLP is also improved over that in the AMIP run. Figure 8 indicates the distribution of ACCs of SLP with the reference for boreal winter (December-February), spring (March-May), summer (June-August), and fall (September-November). The ACCs increase over the Indian Ocean for all seasons in the MOVE-C run. We also find that the MOVE-C run generally has higher ACCs around the central tropical Pacific. Furthermore, the ACCs over the Philippine Sea exceed 0.8 and are much higher than those in the AMIP run in summer.
Fig. 9 (caption, partial). The anomaly is the deviation from the monthly climatology, and the three-month running mean is applied for both SLP and the index. The correlation coefficients averaged for ensemble members are adopted for the MOVE-C and AMIP runs.
These improvements are also due mainly to the better response of the atmospheric fields to ENSO. Figure 9 depicts maps of the four-month and seven-month lagged correlation of SLP with the NINO3 index. Since the peak of an El Niño is generally in December, the four-month and seven-month lagged correlations roughly represent the properties of the spring and summer following El Niños. The correlation coefficients are too high over the Indian Ocean in the CGCM free run (Fig. 9(d)). This spuriously high correlation may be due to insufficient representation of intrinsic variability associated with the Indian Ocean in the CGCM. In contrast, the correlation coefficients are underestimated over the Indian Ocean in the AMIP run (Fig. 9(c)). Furthermore, the AMIP run has lower correlation coefficients than the reference over the western equatorial Pacific and Philippine Sea. These errors are mostly removed in the MOVE-C run (Fig. 9(b)), resulting in the improved SLP accuracy shown in Fig. 8. The better response of SLP to ENSO in the MOVE-C run is probably associated with the improvement of the precipitation field over the South Indian Ocean, Philippine Sea, and western equatorial Pacific shown in Fig. 5. In Fujii et al. (2009), we indicated that the correlation between SST and precipitation is negative over the Philippine Sea and western equatorial Pacific, and around zero over the Indian Ocean in summer. This correlation is always spuriously high in the AMIP run, due to the lack of the negative feedback between SST and precipitation. This spurious correlation contaminates the variation of precipitation coupled to ENSO and reduces the correlation of SLP with ENSO in the AMIP run. This problem is mitigated by restoring the negative feedback through the CGCM in the MOVE-C run. The previous study (Fujii et al., 2009) also suggested that appropriate reproduction of the variability of the Walker Circulation and monsoon trough is a factor that improves precipitation over the Philippine Sea in summer.
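The lagged correlations with the NINO3 index used above can be sketched as follows for one grid point. This is a minimal illustration, not the authors' code: the function names are hypothetical, a positive lag is taken to mean that the field follows the index, and one index series is assumed to be shared by all ensemble members. The averaging of the coefficients over members follows the description in the figure captions.

```python
import math

def correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def lagged_correlation(index, field, lag):
    """Correlate field[t + lag] with index[t]; lag is in months.

    A positive lag means the field follows the index (assumed convention).
    """
    if len(index) != len(field):
        raise ValueError("index and field must have the same length")
    if abs(lag) >= len(index) - 1:
        raise ValueError("lag is too large for the length of the series")
    if lag >= 0:
        x = index[:len(index) - lag]
        y = field[lag:]
    else:
        x = index[-lag:]
        y = field[:len(field) + lag]
    return correlation(x, y)

def ensemble_mean_lagged_correlation(index, members, lag):
    """Average of the lagged correlation coefficients over ensemble members."""
    coeffs = [lagged_correlation(index, member, lag) for member in members]
    return sum(coeffs) / len(coeffs)

# Toy data: an oscillating index and a field that repeats it four months later.
nino3 = [math.sin(0.2 * t) for t in range(200)]
slp = [math.sin(0.2 * (t - 4)) for t in range(200)]

r_lag4 = lagged_correlation(nino3, slp, 4)  # field aligned with the index
r_lag0 = lagged_correlation(nino3, slp, 0)  # simultaneous correlation
```

Note that the coefficients are averaged over members here, rather than correlating the ensemble-mean field; the two choices generally give different values because averaging the fields first suppresses the internal variability of the individual members.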
In this study, we compare the response of the Walker Circulation and the monsoon trough to ENSO among the MOVE-C, AMIP, and CGCM free runs. In order to represent the variations of the Walker Circulation and monsoon trough, we extend the W-Y index proposed by Webster & Yang (1992) and the DU2 index proposed by Wang & Fan (1999). A positive W-Y (DU2) index denotes that the Walker Circulation (monsoon trough) is stronger than usual. Both indices are originally calculated from the three-month mean field of the zonal wind in boreal summer (June-August) and thus represent the state of the Walker Circulation and monsoon trough only in summer. In this study, these indices are calculated for each month using the three-month running mean of zonal wind fields. Figure 10(a) demonstrates that the response of the Walker Circulation to ENSO is remarkably improved in the MOVE-C run over the other two runs. The plot of the reference data for the W-Y index indicates that the Walker Circulation is suppressed most around the peaks of the positive phases of ENSO (i.e., El Niño). The plot is almost symmetrical about the zero-lag line, and the minimum value reaches −0.7. The response of the Walker Circulation is underestimated in the AMIP run, probably due to the weaker response of the velocity potential over the maritime continent (Fig. 6). This weak response mitigates the change of the zonal gradient of the velocity potential and reduces the change of the zonal wind in the upper layer, resulting in moderation of the change of the Walker Circulation. In contrast, the response of the Walker Circulation is overestimated and slightly lagged in the CGCM free run due to the stronger response of the velocity potential.
In the MOVE-C run, the scale of the response is adequately estimated, and the plot is much closer to that for the reference, although the recovery of the Walker Circulation after the peaks is more rapid than the moderation before them, as in the other two runs. The response of the monsoon trough to ENSO is also reproduced best in the MOVE-C run, as shown in Figure 10(b). The DU2 index is slightly negative four months before the peaks of El Niños, and it reaches the minimum value, about −0.5, six months after the peaks. This negative value represents the development of an anticyclone over the Philippine Sea after El Niños (e.g., Wang et al., 2003; Xie et al., 2009). This property is recovered well in the MOVE-C run. In contrast, in the AMIP run, no positive correlation before the peak is estimated, the lag of the minimum correlation is shorter than the real one, and the minimum value is slightly underestimated. The shorter lag and the weaker response are probably caused by the spuriously high correlation between SST and precipitation disturbing the atmospheric response, particularly in summer. The improved response of the DU2 index in the MOVE-C run over the AMIP run is associated with the better response of SLP over the Philippine Sea (Fig. 9) and the improved accuracy of SLP (Fig. 8). In the CGCM free run, the positive correlation before the peak of El Niños is overestimated and the negative correlation after the peaks is underestimated. Finally, MOVE-C improves the responses of the atmospheric circulation, including the Walker Circulation and the circulation associated with the monsoon trough, over the AMIP run, resulting in better SLP and upper-tropospheric velocity potential fields. This improvement stems from restoring the negative feedback between SST and precipitation in MOVE-C. The feedback keeps the response of precipitation to the oceanic near-surface temperature field at an adequate level. Thus, calculating the coupled air-sea process explicitly through a CGCM (which is not possible with an AGCM alone) mitigates the inconsistency between the ocean and atmosphere and improves the representation of climate variability, including the ENSO response of the atmosphere. This result demonstrates a benefit of assimilating observation data directly into a CGCM in a coupled data assimilation system.
Fig. 10. Plots of the correlation coefficients of (a) the W-Y index and (b) the DU2 index with the NINO3 index against the lag (month) of the W-Y or DU2 indices for the reference data (black), AMIP run (blue), MOVE-C run (red), and CGCM free run (purple). The coefficient is calculated for the NINO3 index in the period of 1980-2007. The correlation coefficients averaged for ensemble members are adopted for the MOVE-C and AMIP runs.

Summary
This chapter highlights the essential role of ocean data assimilation systems for ENSO monitoring and SI forecasting. Considerable efforts to improve the data assimilation systems, together with the increase in ocean observation data, have brought about the remarkable recent development of ocean data assimilation systems, contributing to the improvement of SI forecasting.
Although ENSO and seasonal forecasting are now sophisticated enough to be used operationally, further improvements are strongly desired to reduce the damage of climate disasters and to increase the efficiency of industry, agriculture, and fisheries. Further development of ocean data assimilation systems is a possible factor for such improvement. Although the innovative progress of the development thus far has followed substantial changes of the ocean observing system as described in Section 2, no other substantial change is likely to occur in the next few years. Instead, we assume that imposing the ocean-atmosphere balance relationship to mitigate the coupled shock is the key to further improvement of the assimilated fields and SI forecasting, just as imposing the T-S balance relationship was the key to progress from the first generation to the second generation of ocean data assimilation systems. In order to achieve such development, the use of a coupled atmosphere-ocean model is essential; thus, it is necessary to develop a system in which observation data are assimilated into a CGCM. In Section 5, we demonstrated a benefit of this strategy. However, the quasi-coupled data assimilation system introduced in Section 5 is insufficient because it relies on adjustment in the coupled model integration for establishing the atmosphere-ocean balance. Developing a scheme to assimilate atmospheric data is also desirable. Atmospheric data have an inherent potential to improve assimilated fields, as demonstrated in Balmaseda & Anderson (2009), although the difference in the dominant time scales of the atmosphere and ocean makes the assimilation rather difficult. To address these issues, several institutes, including JMA/MRI, are currently developing truly coupled data assimilation systems, in which oceanic and atmospheric data are assimilated into a coupled model while imposing the ocean-atmosphere balance relationship.
We expect those systems to form the third generation of data assimilation systems for ENSO monitoring and SI forecasting. It should also be noted that a solid ocean observing system is essential for issuing reliable information on ENSO and seasonal forecasts. Therefore, sustaining current observing platforms, including the TAO/TRITON array and Argo floats, is crucially important, as is proposing innovative and potential platforms. In order to sustain current platforms securely, it is necessary to demonstrate to the administrators the essential effects of those observation data in a readily visible manner, as attempted in Section 4. In that sense, the activity of the GODAE Ocean View observing system evaluation task team (see the end of Section 4) should be seriously supported. In particular, we hope that the observation impact statement will have a substantial effect on the secure development of the ocean observing system, resulting in further improvements of ENSO monitoring and SI forecasting.