Seasonal-to-interannual prediction of North American coastal marine ecosystems_ Forecast methods, mechanisms of predictability, and priority developments

Marine ecosystem forecasting is an area of active research and rapid development. Promise has been shown for skillful prediction of physical, biogeochemical, and ecological variables on a range of timescales, suggesting potential for forecasts to aid in the management of living marine resources and coastal communities. However, the mechanisms underlying forecast skill in marine ecosystems are often poorly understood, and many forecasts, especially for biological variables, rely on empirical statistical relationships developed from historical observations. Here, we review statistical and dynamical marine ecosystem forecasting methods and highlight examples of their application along U.S. coastlines for seasonal-to-interannual (1–24 month) prediction of https://doi.org/10.1016/j.pocean.2020.102307 Received 25 June 2019; Received in revised form 17 February 2020; Accepted 18 February 2020 ⁎ Corresponding author at: Environmental Research Division, NOAA Southwest Fisheries Science Center, 99 Pacific St., Ste. 255A, Monterey, CA 93940, United


Introduction
Businesses, government agencies, and the public rely on weather forecasts for planning and responding to changing environmental conditions. While these forecasts are useful to ocean-related sectors, there is also high and increasing demand for ocean forecasts to support proactive decision making in the marine environment (e.g., for fisheries, aquaculture, shipping, energy, tourism, and public health). Forewarning of ocean conditions can help decision makers reduce negative impacts in unfavorable conditions and maximize opportunities in favorable conditions (Tommasi et al., 2017a;Hobday et al., 2016Hobday et al., , 2018. The rise of marine ecological forecasting research in recent years builds upon a substantial knowledge base and infrastructure in climate forecasting (Doblas-Reyes et al., 2013), and has been driven largely by increasing demand for decision-support tools to help marine stakeholders prepare for and adapt to ecosystem variability and change (Payne et al., 2017;Tommasi et al., 2017a) including extreme ocean conditions such as marine heatwaves (Holbrook et al., 2019), ecosystem health stressors , and the lack of sea ice in subarctic/arctic waters .
Marine policy and resource planning decisions are made based on forecasts and outlook information on various timescales. In contrast to climate projections, which are used to explore long-term trends based on various scenarios and mostly inform policy level decisions and longterm planning, forecasts for days to years aim to provide prognostic information of ocean conditions on the timescales relevant to marine resource managers and other end-users. A number of marine ecosystem forecasts have been cited in the literature (e.g., Payne et al., 2017), though many are in fact nowcasts or target short lead times (i.e., days). For example, nowcast tools that predict marine species distributions based on observed environmental conditions have been developed to better manage fisheries bycatch and ship-strike risks (Howell et al., 2008;Hazen et al., 2017Hazen et al., , 2018Welch et al., 2019a;Thorne et al., 2019) and three-day probability forecasts for blooms of the harmful algal bloom (HAB) forming phytoplankton Pseudo-nitzchia alert managers to potential shellfish toxicity that could necessitate fishery closures (Anderson et al., 2016). Similarly, coastal high water levels have historically been a more suitable target for short-lead prediction with a crucial focus on inundation impacts on timescales from hours to days. However, in many cases, longer lead times (i.e., 1-24 months) are better aligned with management and industry decisions. Forecast systems for southern bluefin tuna in Australia, with several months lead time, have been useful products for fishers to plan the start of their fishing season (Eveson et al., 2015) and for managers to set seasonal area closures based on impending ocean conditions (Hobday et al., 2011). In the eastern Bering Sea, preliminary seasonal predictions of the "cold pool" (i.e., the area where bottom temperature is < 2°C) were included in Ecosystem Status Reports from 2015 to 2018 to allow managers to consider fishing quotas in light of expected conditions (Siddon and Zador, 2018). Indeed, marine ecosystems are well suited to seasonal-to-interannual (S2I; 1-24 month) prediction given the convergence of ecosystem variability, physical drivers, and management and industry decisions that take place on these timescales. A number of efforts are underway to develop marine ecosystem forecasts for S2I timescales along the coasts of the United States (U.S.), leveraging improved technological capabilities, model development, and scientific understanding (Fig. 1, Table 1). The S2I timescale is the focus of this paper, though we briefly discuss shorter and longer timescales as appropriate.
The forecasts outlined above differ greatly in the models, observations, and assumptions upon which they rely, and the specific end-user needs they are meant to address. Of particular note for this paper, they also exploit a wide range of physical, biogeochemical, and ecological processes that give rise to predictability on different spatial and temporal scales. However, documentation of an apparently predictable response in the marine ecosystem often precedes a thorough understanding of the mechanisms underlying that predictability. For example, while recent studies have suggested multiple potential S2I  Table 1. Ocean bottom depth (m) is indicated in blue/white shading. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.) Table 1 Summary of seasonal-to-interannual (S2I) marine ecosystem forecasts along U.S. coastlines including key mechanisms of predictability. Corresponding locations are shown in Fig. 1

3.2.2)
Persistence, Gulf Stream/shelf interaction, equatorward advection, atmospheric teleconnections In development Nye et al. (2011), Lucey and Nye, (2010), Xu et al. (2015), Davis et al. (2017)  Species response to environmental change predictors for temperature and sea level on the Northeast U.S. Shelf, it is not yet clear how individual predictors are dynamically related and why statistical correlations with basin-scale climate indices hold up only over portions of the observational record (e.g., Li et al., 2014). Furthermore, while physical forecast systems have been developed largely from first principles, many predictions, particularly of biological quantities, rely on empirical statistical relationships whose underlying mechanisms are not well understood. Uncertainty in the drivers of observed relationships increases concerns of nonstationarity, i.e., that a seemingly robust relationship will not hold up over time (e.g., Myers, 1998) and forecasts based on it will fail as a result. To the extent that statistical models can be informed by mechanistic understanding of physical-biological relationships or other ecological responses (e.g., Section 4.2), they will likely perform better in a predictive capacity (Cuddington et al., 2013). The primary aims of this paper are to review forecasting approaches being developed and implemented for marine ecosystems, document mechanisms underlying predictability in these regions, and explore how these mechanisms can be exploited for marine forecasting efforts. We focus primarily on North American large marine ecosystems, but emphasize that many of our findings and recommendations are applicable to coastal regions around the globe. We focus on the S2I timescale; shorter and longer timescales are mentioned briefly and are of importance to marine decision making, but are outside the scope of this review. The forecasts and mechanisms addressed in this paper apply to physical, biogeochemical (chemistry, phytoplankton, and zooplankton), and ecological (higher trophic level) properties. We outline these properties in Section 2, and then detail dynamical and statistical forecast methods (Section 3), mechanisms of predictability (Section 4), and pressing challenges and priority developments (Section 5). Finally, concluding remarks including a summary of recommendations for future work are presented in Section 6.

Scope of marine ecosystem forecasts
Marine ecosystem research encompasses topics from physical atmosphere/ocean dynamics to coupled social-ecological systems, and marine ecosystem forecasts can be similarly diverse in their aims. For example, marine ecosystem forecasts are expected to aid in the implementation of ecosystem-based management, which requires quantitative methods, criteria, and thresholds to assess overall ecosystem status, evaluate trade-offs among ecosystem services, and guide management actions (Levin et al., 2009;Leslie and McLeod, 2007). To that end, forecasts are desired for various biotic and abiotic components of marine ecosystems in the water column and on the seafloor, as well as the structure and function of marine ecosystems and ecological communities (Levin and Schwing, 2011).
In this paper we address predictability associated with a range of physical, biogeochemical, and ecological properties for which prognostic information is beneficial to end-users, and for which the necessary observations are available to develop, initialize, and evaluate forecasts (Capotondi et al., 2019a). Atmospheric state variables of particular interest include near surface fields such as winds, pressure, temperature, and precipitation, the last of which is also used to estimate river outflow. Commonly targeted physical oceanic variables include sea level, derived quantities such as mixed layer depth, and state variables (e.g., temperature, salinity, currents) that may be 2D (surface, bottom) or 3D. Biogeochemical variables of interest (e.g., pH, aragonite saturation state, oxygen concentration, primary and secondary production) are similarly needed at different depth levels depending on the application, though depth-integrated values may also be sufficient in some cases. Metrics for physical and biogeochemical forecasts will also vary, as different applications may be concerned with the absolute state of the ecosystem, deterministic or probabilistic anomalies, or phenological changes. Finally, characteristics of living marine resources (LMR) are targeted in forecast applications that include improving the efficiency of fishing practices (e.g., Eveson et al., 2015;Kaplan et al., 2016;Turner et al., 2017), mitigating impacts of public health threats such as HABs (e.g., Stumpf et al., 2009), and alleviating human threats to protected species (Thorne et al., 2019). LMR occurrence, abundance, and distribution are some of the most common quantities forecasted in ecological applications (Table 1; Tommasi et al., 2017a, Payne et al., 2017, as they often respond to environmental variability in predictable ways.

Marine ecosystem forecasting methods
A number of dynamical and statistical approaches are available for S2I forecasts (Fig. 2), each with advantages and disadvantages. Dynamical prediction systems can resolve non-linearities in natural systems and use them to improve model skill, and can forecast unprecedented situations or nonstationarity in the climate system. In contrast, statistical prediction systems generally require long historical records for model training and development, and are not well equipped to Fig. 2. Summary of dynamical and statistical marine ecosystem forecasting approaches. Numbers accompanying individual arrows indicate corresponding sections in the text. Note that the Linear Inverse Model is an "empirical dynamical model", as it evolves the system forward in time, but is built on statistical relationships and for our purposes is included with statistical methods. Adapted from Capotondi et al. (2019a) anticipate unprecedented conditions or handle nonstationarity in physical-biological systems. However, statistical prediction systems are much less resource intensive than their dynamical counterparts, are not subject to biases that arise from model error in dynamical prediction systems, and offer an alternative for systems (especially in ecological applications) in which we lack the understanding needed to construct and parameterize dynamical models. Several statistical and dynamical forecast approaches are described in more detail below; given their complementary nature, both are needed for maximum realization of forecast skill.
3.1. Physical forecasting methods 3.1.1. Coupled global climate forecasts S2I forecasts from dynamical coupled prediction systems are now run routinely at multiple operational centers (Graham et al., 2011). Most commonly, dynamical prediction systems consist of several component models, each with detailed implementation of part of the earth system (i.e., atmosphere, ocean, sea ice, land), that are coupled to capture climate feedbacks and controls. A number of these prediction systems additionally include ocean biogeochemical fields such as carbon, nutrients, oxygen, phytoplankton, and zooplankton. Several national or international programs, including the North American Multi-Model Ensemble (NMME; Kirtman et al., 2014) and the WMO Lead Centre for Long-Range Forecast Multi-Model Ensemble (LC-LRFMME; Graham et al., 2011;www.wmolc.org), have been developed to consolidate predictions from individual modeling centers and enable investigation and use of multi-model ensemble forecasts. On timescales beyond one year (and up to 10 years), forecasting efforts are being advanced primarily through the World Climate Research Program (Kushnir et al., 2019), phases 5 and 6 of the Coupled Model Intercomparison Project, and nascent efforts at various operational centers (e.g., the Community Earth System Model Decadal Prediction Large Ensemble; Yeager et al., 2018). While they are not the focus of this paper, subseasonal (2 week -2 month) forecasts are similarly being actively developed through programs including the National Weather Service Global Ensemble Forecast System (Hamill et al., 2013), the Subseasonal Experiment (SubX; Kirtman et al., 2017;Pegion et al., 2019), and the Sub-Seasonal to Seasonal Prediction Project (S2S; Vitart et al., 2017).
Dynamical prediction systems have been extensively studied for their abilities to capture relevant phenomena and predictability sources for seasonal (1-12 month) timescales (Huang et al., 2014). For example, they have been shown to provide skillful predictions for largescale modes of climate variability including the El Niño-Southern Oscillation (ENSO; https://iri.columbia.edu/our-expertise/climate/ forecasts/enso/current/) and associated teleconnections (Trenberth et al., 1998), for monthly sea surface temperature anomalies (SSTa) in coastal Large Marine Ecosystems around the U.S. and elsewhere (Stock et al., 2015;Hervieux et al., 2017), for SST and precipitation anomalies over insular Hawaii and U.S.-affiliated islands (Annamalai et al., 2014), and for sea surface height (SSH) anomalies in much of the tropical Pacific Ocean including around U.S.-affiliated islands (Widlansky et al., 2017). However, the relatively coarse resolution of dynamical prediction systems is not well suited for capturing many complex fine-scale processes (e.g., freshwater discharge, tidal forcing, coastal upwelling) that are of interest for coastal marine ecosystem prediction. In these cases, skillful forecasts of large-scale modes of variability and their influence on regional features, together with statistical or dynamical downscaling techniques discussed in the following sections, may provide more useful information.
In addition to real-time forecasts, many climate prediction centers provide extensive sets of reforecasts (i.e., forecasts made for past periods using observations available at the time of the forecast) using the same model configuration as that used for the real-time forecasts. In the context of S2I predictions, reforecasts are crucial to (i) develop appropriate forecast calibration methods and correct for dynamical prediction system biases (which can be of the same order of magnitude as the quantities being predicted), (ii) assess the skill of the prediction system for specific applications, and (iii) determine and understand sources of predictability.

Dynamically downscaled regional forecasts
Dynamical downscaling allows one to leverage atmosphere and ocean state estimates from relatively coarse resolution ocean/atmosphere/climate models while better resolving fine-scale dynamics in a specific area of interest. In marine science, dynamical downscaling typically involves running a regional ocean model forced at the ocean surface and lateral boundaries by output from a global model (Fig. 2). In the case of retrospective analyses, boundary conditions are often derived from ocean/atmosphere reanalyses, while output from global climate models or earth system models is used to force dynamically downscaled climate projections (e.g., Auad et al., 2006, Sun et al., 2012, van Hooidonk et al., 2015, Xiu et al., 2018 and forecasts (e.g., Siedlecki et al., 2016).
In the case of S2I forecasts that are of interest here, dynamical downscaling has the potential to resolve fine-scale responses to predictable large-scale climate forcing. For example, global models can generally forecast ENSO-related wind stress anomalies in the Northeast Pacific with some skill, but they are unable to resolve the regional ocean response to those anomalies. In particular, anomalies of upwelling intensity and other key variables (e.g., temperature, pycnocline depth, nutrient fluxes) are greatest within tens of km of the coast; in global models these processes are sub-grid scale, resulting in signals that are too weak and diffuse . Similarly, global models with data assimilation can reasonably simulate open ocean variability in the Northwest Atlantic (such as Gulf Stream variability), while downscaling using a regional model is needed to accurately simulate the responses of Northeast U.S. Shelf circulation to the open ocean forcing (Chen and He, 2015). However, increased resolution alone does not correct all shortcomings of global forecasts. While regional models can resolve fine-scale anomalies, they rely on global models to impart the forcing that generates those anomalies. An important area of investigation is the degree to which signals that are poorly resolved in global models (e.g., coastal trapped waves, poleward undercurrents, buoyancy-driven coastal currents) can be transmitted through regional model boundaries into the domain where they are better resolved. Furthermore, regional models inherit not only skill but also bias from global models. In both of these respects, downscaled forecasts may benefit from global fields that are statistically corrected to remove biases and recover fine-scale variability at the lateral and surface boundaries of the regional domain (see Section 3.1.4).
There are several active efforts to produce dynamically downscaled seasonal ocean forecasts along U.S. coastlines (Fig. 1, Table 1). In the Northeast Pacific, these efforts include one for the Oregon/Washington shelf (JISAO's Seasonal Coastal Ocean Prediction of the Ecosystem; JSCOPE; Siedlecki et al., 2016), one for the California Current System using the UC Santa Cruz CCS configuration of the Regional Ocean Modeling System (ROMSe) (Veneziani et al., 2009), and one for the eastern Bering Sea using the Bering10K ROMS configuration (Hermann et al., 2013;Ortiz et al., 2016;Kearney et al., 2019). While each of these efforts shares a common overall approach, they differ in their details and intended applications (Table 1). Dynamically downscaled S2I ocean forecasts along the east coast are in a relatively nascent stage, partly due to the absence of a dominant interannual climate signal to impart predictive skill (as ENSO does along the U.S. west coast). Currently operational forecast systems for the east coast, e.g. the MARA-COOS models for the Mid-Atlantic Bight (http://assets.maracoos.org/), the U.S. Northeast Coastal Forecast System (http://fvcom.smast. umassd.edu/necofs/) and the coupled Northwest Atlantic Prediction System (http://omgsrv1.meas.ncsu.edu:8080/CNAPS/), target much shorter lead times, up to 72 h.

Statistical downscaling and forecasting
Statistical downscaling offers an approach for predicting marine ecosystem variability based on the underlying premise that regional ecosystem variability can be explained by large-scale climate dynamics and their interaction with regional influences including coastal orography, bathymetry, shelf-basin exchange, and characteristics of the land-sea interface. The approach generally involves developing quantitative relationships between larger-scale climatic variables (predictors), that are typically well predicted by climate and weather forecasting models, and local variables and conditions of interest (predictands) (Fig. 2). There is a large variety of statistical forecasting approaches, some of which are univariate, namely they relate the predictand to just one predictor (e.g., Davis et al., 2017), while others incorporate multiple predictors (multivariate approaches). In many cases, the predictor is a single variable representing some pattern of large-scale ocean and/or atmosphere variability, e.g., the Gulf Stream variability (Davis et al., 2017), the North Atlantic Oscillation (Xu et al., 2015) or ENSO (Faggiani Dias et al., 2018). Widely used multivariate approaches rely on multiple linear regressions for determining the relative influence of different predictors at a given lead time (e.g., Sanchez-Franks et al., 2016), but more advanced non-linear statistical techniques have also been used, particularly for predictions of atmospheric variables (e.g., Gaitan et al., 2014). Statistical downscaling has been successfully used for S2I forecasting of climate variables, to reduce uncertainty associated with application of downscaled climate information across multi-scale models, and for the development of future climate scenarios (Schoof, 2013;Wilby et al., 2004). In all cases, the potential of statistical methods for forecasting depends substantially on the availability of consistent, reliable observational data sets (e.g., in situ or satellite) that adequately capture large-scale atmosphere/ocean variability, regional oceanic responses, and relationships between the two.
One type of statistical downscaling used to better understand complex time-space mechanisms associated with weather variation and oceanic responses is synoptic climatology . Synoptic climatology represents a holistic approach in first categorizing atmospheric conditions into one of multiple modes of variability and then assessing the relationship between these atmospheric modes and an environmental outcome (Yarnal, 1993). The predictor variables are typically derived from global/regional reanalysis data sets and can include atmospheric properties near the earth's surface, such as sea-level pressure or winds, or higher in the atmosphere, such as 700 hPa or 850 hPa air temperature or geopotential height. The synoptic methodology serves as an effective statistical downscaling tool for output from weather forecasting models or global climate models (Schoof, 2013), and in some cases has proven to be superior to finer-scale modelgenerated atmospheric data (Wetterhall et al., 2009). Synoptic climatology is especially useful for forecasting on timescales of a week or longer, where models would be unable to render atmospheric features with sufficient precision for dynamic modeling, as in the case of coastal ecosystems where shelf-and estuarine-scale processes and air-sea interactions are primarily controlled by weather events and not clearly linked to global scale climate forcing (Sheridan et al., 2013, Pirhalla et al., 2015. A unique multivariate approach is Linear Inverse Modeling (LIM), where information on the statistics of the system is used to develop a model of the system with basic features that are independent of the forecast lead time. The LIM framework describes the system of interest in terms of an anomaly state vector, usually constructed from monthly anomalies of the key system variables, whose evolution is modeled in terms of linearly damped and stochastically perturbed dynamics (Penland and Sardeshmukh, 1995). The LIM framework has been  (Merryfield et al., 2013), (b) a 10 km regional ocean model simulation using the Regional Ocean Modeling System (ROMS) forced by CanCM4 fields that were interpolated directly to the ROMS grid, (c) a ROMS simulation forced at the surface and lateral boundaries by CanCM4 fields that were first bias corrected, and (e) the observed ocean state, as obtained from a ROMS CCS historical reanalysis (Neveu et al., 2016). (d) Monthly climatological SST in a central CCS region (34.5-43°N, 0-100 km offshore; black lines in a-c,e), with line colors corresponding to labels in other panels. extensively applied to the study of ENSO in the tropical Pacific (Penland and Sardeshmukh, 1995;Newman et al., 2009Newman et al., , 2011Capotondi andSardeshmukh, 2015, 2017) and for examining the prediction of North Pacific SSTa and the Pacific Decadal Oscillation (PDO, . Recent studies have shown the LIM to have seasonal forecast skill comparable to that of the NMME for conditions in the tropical Pacific (Newman and Sardeshmukh, 2017) and North Pacific (Faggiani Dias et al., 2018).

Hybrid statistical/dynamical approaches
Several forecasting efforts leverage both dynamical and statistical methods, and their respective strengths, to improve understanding of regional scale ocean responses to broad-scale forcing (Fig. 2). In one approach, atmospheric and/or oceanic fields from global models are statistically corrected over the region of interest before using them to force a regional ocean model. Statistical correction of the global fields may be as simple as bias correction, or may use more involved methods including canonical correlation analysis (e.g., Huth, 2002;Fowler et al., 2007) or multiple linear regression (e.g., Goubanova et al., 2011). At the ocean surface, statistically downscaling the wind forcing has been shown to improve the representation of ocean dynamics in Eastern Boundary Upwelling Systems by more accurately capturing features that drive environmental gradients near shore (Machu et al., 2015). If global fields are directly interpolated onto the regional model grid, the increased regional resolution alone will not necessarily correct biases inherited from the global model, a problem alleviated by statistical correction of the surface forcing (Fig. 3). Similarly, statistical downscaling of ocean conditions at the lateral boundaries can reduce the transfer of biases into the regional domain and may improve the transfer of features such as coastal trapped waves, which have an identifiable signature in global models even though they are not resolved.
A second approach aims to retain benefits of dynamical downscaling while reducing concerns related to computational expense and the limited availability of forcing fields. As in the synoptic forecasting approach described in the previous section, output from a small ensemble of dynamical downscaling runs is used to establish multivariate statistical relationships between the large-scale multivariate forcing and the small-scale multivariate response . Once such relationships are established, these relationships may be used to statistically project global forcing from a wider ensemble of models (to capture more structural uncertainty) and more realizations of each model (to capture more intrinsic uncertainty) onto the dynamically derived multivariate modes, effectively yielding a downscaled ensemble much larger than would be feasible with dynamical downscaling alone.

Dynamically downscaled regional forecasts
In a subset of global forecast systems (Section 3.1.1), biogeochemical variables are included in addition to physical ones, enabling the dynamical downscaling approach (Section 3.1.2) to be extended from physics to biogeochemistry. However, downscaling biogeochemical fields introduces the added challenge that variables available at the global scale and those in the regional model may not fully correspond or may be governed by a different set of equations. This mismatch comes primarily from the fact that even though most biogeochemical and lower trophic level ecosystem models adhere to a similar general formulation (i.e., nutrients-phytoplankton-zooplankton-detritus), the level of complexity with which each component or trophic level is represented typically varies (e.g., number of limiting nutrients or phytoplankton functional groups), as do the details of relationships between model components. In addition, the resolution of the models dictates the ability for the model to simulate the processes being resolved. For example, coarse resolution models cannot resolve shelf processes essential for capturing hotspots of respiration of organic material on the shelf, which result in corrosive and low oxygen conditions in the Northern CCS (Siedlecki et al., 2015. On the other hand, global models may carry additional nutrients (e.g., iron) and resolve additional functional groups (e.g., diazotrophs) that are not essential for capturing the dominant modes of biogeochemical and ecosystem variability in a region of interest.
Ideally, the global and regional models would use biogeochemical and ecosystem formulations of identical complexity to guarantee compatibility (e.g., Van Oostende et al., 2018), but this approach is often impractical for several reasons: (1) the models used in the downscaled regional domain are generally application-specific and emphasize different biogeochemical processes (e.g., carbon cycling to predict ocean acidification or phytoplankton-zooplankton assemblages to predict forage fish species distributions), (2) significant effort and local knowledge have already been devoted to the complexity, calibration and evaluation of models used in the downscaled regional domain, and (3) global earth system models do not use a common formulation for biogeochemistry, so a regional model could only be matched to one or a subset of global models. In regions where physical and biogeochemical variability is generated predominantly by local forcing, the predicted ecosystem response may be less sensitive to discrepancies in the downscaled variables at the model's open boundaries. In contrast, for a region strongly influenced by remote forcing, a more careful matching of the biogeochemical and ecosystem variables may be necessary (e.g., additional downscaling of phytoplankton and zooplankton functional groups common to the global and regional models or aggregating total phytoplankton biomass and redistributing it into regional functional groups based on local ecosystem properties). Understanding the sensitivity of regional ecosystem responses to these biogeochemical and ecological formulations is an important area for future research.
When the coarser model being used to force a regional model does not include a biogeochemical or ecosystem component, regional downscaling is still possible (e.g., Fennel et al., 2006;Davis et al., 2014;Siedlecki et al., 2015Siedlecki et al., , 2016. However, it is essential to ensure that the large-scale biogeochemical variability, and the predictability it imparts, is appropriately downscaled to the regional simulation. Therefore, some approaches, such as relying on climatological nutrient conditions at the boundaries, are not suitable for applications where nutrient variability outside of the regional domain significantly influences the ecosystem variability inside the domain. One approach to recover large-scale biogeochemical variability from a physical global model is to derive biogeochemical fields (e.g., oxygen, nitrate, DIC, total alkalinity) from physical fields (e.g., temperature, salinity) using local empirical relationships (e.g., Davis et al., 2014;Siedlecki et al., 2015Siedlecki et al., , 2016) that have in some cases already been established from local observations (e.g., Alin et al., 2012;Jacox et al., 2018). This approach ensures that biogeochemical variability associated with large-scale anomalous physical conditions is translated to the downscaled simulation, but it is limited by the stationarity of the empirical relationships being applied. From a predictability standpoint, it remains to be demonstrated whether errors introduced by statistical downscaling of global fields introduces larger uncertainties in the regional solution than those inherently associated with the global model output, especially when considering impacts on broader ecosystem characteristics such as distribution or abundance of higher trophic level species.

Statistical forecasts
Many ecological forecasts follow a 'bottom-up' approach in which physical forecasts are linked to ecological components of interest, most often using empirical statistical models ( Fig. 2; Table 1). Ecological forecasting typically relies on four components: (i) skillful forecasts of physical variables; (ii) predictable responses of an ecological metric of interest to physical change; (iii) relevance to, and engagement with, forecast end-users (Payne et al., 2017;Dietze et al., 2018); and (iv) timeliness of the forecasts to be included within the end-user decisionmaking process and timeline (Zador et al., 2016). In lieu of processbased or mechanistic models of ecological dynamics, for which theoretical knowledge and detailed data required for parameterization (Buckley and Kingsolver, 2012) may be lacking, statistical models (sometimes termed correlative models) offer a pragmatic approach to ecological forecasting. These models can approximate mechanistic forcing through correlations (Dormann et al., 2012), but they should be developed with a priori mechanistic understanding (Fourcade et al., 2018) and must account for both physical forecast error and uncertainty in empirical statistical relationships that can further degrade forecast skill (e.g., Turner et al., 2017; see also Section 4.2.2). Alternatively, some ecological forecasts rely on ecological lags, rather than physical forecasts, to generate predictability. Such forecasts can be based on the life history of individual species; for example, spring/ summer abundance of the toxic dinoflagellate Alexandrium fundyense in the Gulf of Maine is forecast based on counts of resting cysts (their dormant stage) the previous fall/winter (Anderson et al., 2014). In some cases, observed environmental conditions are statistically correlated with some later response in the ecosystem. Ocean temperature measurements can be used to predict the beginning of the annual lobster migration in Maine, with 1-3 months lead (Mills et al., 2017) and silver hake distributions on the Northeast U.S. shelf have been found to lag shifts in the Gulf Stream path by roughly 6 months, a relationship mediated by the covariance of Gulf Stream position and bottom temperature on the shelf (Nye et al., 2011;Davis et al., 2017). In other cases, ecological forecasts are generated by extrapolating from timeseries data without the need for covariates or exogenous variables (Stergiou and Christou, 1996;Stergiou et al., 1997). This type of approach is most commonly used to forecast annual fisheries landings but has broader application to forecasting populations in general (Ward et al., 2014).

Mechanisms of predictability in marine ecosystems
In this section we document mechanisms that impart predictability to physical, biogeochemical, and ecological properties of marine ecosystems. These mechanisms include oceanic processes, atmospheric and coupled ocean-atmosphere phenomena, sea-ice dynamics, biogeochemical and ecological responses to environmental forcing, and life history properties of marine populations (Table 2). Together, these mechanisms have the potential to be leveraged in marine ecosystem forecasts for wide-ranging applications. We focus on predictability associated with these mechanisms on S2I timescales, though in some cases they have broader application (e.g., see Liu and Di Lorenzo (2018) for mechanisms associated with predictability of Pacific decadal variability). Similarly, our discussion uses regionally specific examples (i.e., highlighting the manifestation of mechanisms along North American coasts), but they occur throughout the world's oceans.

Mechanisms of physical predictability
4.1.1. Oceanic processes 4.1.1.1. Persistence. Ocean anomalies, once formed by processes as diverse as surface fluxes and advection, can persist for extended periods in a given location. This persistence of anomalies carries inherent predictability, which is much higher in the ocean than in the atmosphere. For example, given the high specific heat capacity of water, anomalies in temperature and other ocean variables can remain in place for months, enabling skillful forecasts on S2I timescales based on persistence alone. This persistence can be influenced by a wide range of dynamical processes and is seasonally dependent at mid-latitudes due to changes in the mixed layer (ML) depth and consequently its thermal inertia.
Among North American Large Marine Ecosystems (LMEs), those in the Pacific (Eastern Bering Sea, Gulf of Alaska, California Current, Insular Pacific Hawaiian) exhibit higher SSTa forecast skill from persistence than those in the Atlantic (Southeast U.S. Continental Shelf, Northeast Continental U.S. Shelf) ( Fig. 4; Hervieux et al., 2017). For LMEs in the North Pacific, persistence SSTa forecasts provide significant skill for lead times of at least 3 months, regardless of when they are initialized, and for up to 12 months (and likely longer) for certain initialization months (Fig. 4). In the Northeast and Southeast U.S. LMEs, persistence forecast skill for SSTa typically extends only 1-2 months (Fig. 4), though Gulf of Maine SSTa formed in early spring can persist for~6-8 months, much longer than those formed in early summer Chen and Kwon, 2018). In the Mid Atlantic Bight (MAB), the relative contributions of the atmosphere-ocean heat flux and ocean advection and their seasonality influence decorrelation timescales of temperature anomalies, which can vary by a factor of two due to strong interannual variability in both atmospheric and oceanic processes. As a result, the predictability of spring temperature anomalies in the MAB is tractable on relatively short (≤ 2-month) timescales (Chen et al., 2016). On both the east and west coasts, decorrelation timescales and associated persistence forecast skill are depth dependent; depthaveraged temperature or water column heat content in the Gulf of Maine has longer decorrelation timescales, and greater predictability, than shelf temperature in the MAB (Chen et al., 2016). Similarly, subsurface anomalies (e.g., bottom temperature) in the Northern CCS are forecast with greater skill than surface anomalies ; the reasons for this finding are under investigation, but increased persistence at depth likely plays a role.
4.1.1.2. Re-emergence. Seasonal variations in the depth of the surface ML can influence the evolution of ocean temperatures. Outside the tropics, temperature anomalies that form at the surface and spread throughout the deep winter ML remain at depth after the ML shoals in spring. These anomalies are incorporated into the summer seasonal thermocline, between approximately 20 and 100 m depth in the North Pacific, where they are insulated from damping by surface fluxes. When the ML deepens again the following fall, the deep anomalies are reentrained into the surface layer and influence SST. In addition to predictability of winter SSTa based on observed conditions the prior Table 2 Summary of mechanisms underlying seasonal-to-interannual (S2I) marine ecosystem predictability along U.S. coastlines including associated timescales of predictability and the sections of the text in which they are discussed. The timescale refers to marine ecosystem predictability, not necessarily the mechanism itself; for example, tropicalextra-tropical atmospheric teleconnections can be quite fast (i.e., weeks;  but may impart predictability on much longer timescales in the ocean (e.g., 1-12 months; Jacox et al., 2017). We focus on the dominant timescales of predictability in the S2I timeframe, though in some cases these mechanisms are associated with predictability on shorter (e.g., for persistence) or longer (e.g., for advection) timescales as well.

Mechanism
Timescales of predictability Section winter, this process can generate predictability of subsurface temperature anomalies (below the ML) in summer. First noted in SST records by Born (1970, 1974) and later termed the "reemergence mechanism" (Alexander and Deser, 1995), this process has been shown to occur over large portions of extratropical oceans including coastal regions (e.g., Alexander et al., 1999Alexander et al., , 2001Hanawa andSugimoto, 2004, Byju et al., 2018), and a similar process can occur for salinity anomalies .
4.1.1.3. Ocean waves. In addition to the familiar wind-driven surface waves, other types of ocean waves can influence marine ecosystems on much longer time and spatial scales. Some, like Kelvin waves, propagate along boundaries including the equator as well as the coasts of continents. The effects of Kelvin and other "coastally trapped waves" are confined to a narrow near-shore region, on the order of tens of kilometers wide. Others, like Rossby waves (also referred to as planetary waves) can be thousands of kilometers long and move slowly westward across the open ocean outside of the equatorial band. The vertical structure of the ocean, with a rapid change in density in the pycnocline, leads to a prominent class of waves with a "first baroclinic mode structure", which produce anomalies of opposite sign in sea surface height and pycnocline depth. While slower than the surface waves, first baroclinic mode Kelvin waves move relatively rapidly and can propagate along the eastern Pacific margin from the equator to Alaska in~2 months. First baroclinic mode Rossby waves propagate relatively slowly, taking several years to cross the Pacific Ocean from east to west. Predictability associated with these waves is of interest as they alter ocean currents and hydrography, and can have longer term effects on ocean physics and biology even after the waves have moved on. 4.1.1.3.1. Coastal waves. Coastal boundaries support baroclinic Kelvin and other coastal trapped waves that propagate at speeds of 200 km day −1 (e.g., Gill, 1982) with the boundary on the right (left) in the northern (southern) hemisphere, and thus act as a source of predictability in the downstream region. In the Pacific basin, baroclinic mode equatorial Kelvin waves, with SSH and thermocline depth anomalies of opposite sign, are an integral part of ENSO events. Wind variations along the equator generate equatorial trapped Kelvin waves that propagate eastward and excite Kelvin and other coastal trapped ocean waves upon reaching the continental boundary. Along their path of propagation, these coastal waves also are driven by variability in longshore winds, whose influence increases with latitude along the North American west coast (e.g., Enfield and Allen, 1980). Coastal trapped waves have been observed to propagate to high latitudes along the west coasts of North and South America, generating substantial variability in SSH, coastal currents, and thermocline depth (Enfield and Fig. 4. SSTa reforecast skill measured by anomaly correlation coefficient (ACC) as a function of lead time and initialization month for seven Large Marine Ecosystems along North American coasts. Gray dots indicate ACC significantly above 0 at the 95% confidence level. White triangles indicate ACC significantly above persistence at the 90% level, with upward triangles showing ACC > 0.5 and downward triangles showing ACC < 0.5. Forecast SSTa was evaluated against the NOAA Optimum Interpolation SST version 2 (Reynolds et al., 2007;Banzon et al., 2016). Adapted from Hervieux et al. (2017) Allen, 1980Chelton and Davis, 1982;Clarke and van Gorder, 1994;Frischknecht et al., 2015;Jacox et al., 2015a), and potentially affecting lower trophic level production (Clarke and Dottori, 2008). While the propagation of coastal trapped waves is a clear mechanism for predictability of downstream conditions, the trapping scale of these waves is typically on the order of tens of kilometers, so they are subgrid scale phenomena in current global climate models. High-resolution regional ocean models have the potential to better harness predictability associated with coastal trapped waves (e,g., Kurapov et al., 2017), though proper modeling of coastal trapped wave propagation must account for frictional decay as well as wave scattering at topographic features.
Similarly, coastal trapped waves could carry anomalous signals along the Northwestern Atlantic coast from the Labrador Sea to the Northeast U.S. shelf. This coastal (or topographic) waveguide has often been suggested as a major equatorward pathway for high-latitude signals associated with the deep limb of the Atlantic meridional overturning circulation (Kawase, 1987;Yang, 1999;Johnson and Marshall, 2004). However, the viability of coastal trapped wave propagation as a mechanism of predictability along the North American east coast is much less clear than it is along the west coast. The topographic waveguides on the east coast are primarily aligned with the continental slope, well offshore in comparison to the west coast, and it is not yet clear how these propagating signals impact the broader shelf environment. In addition, there are some dynamical barriers to coastal trapped waves along the east coast, including the Northwest Corner near the eastern edge of Grand Banks where the Gulf Stream/North Atlantic Current impinges upon the continental slope (Rossby, 1996) and the Laurentian Channel where a significant freshwater plume exits the Gulf of St. Laurence (Richaud et al., 2016). Previous studies have shown that equatorward propagation of anomalies from the subpolar gyre along the Northwest Atlantic shelf and upper slope is predominantly driven by advection (Section 4.1.1.4; Chapman and Beardsley, 1989;Rossby and Benway, 2000;Peña-Molino and Joyce, 2008;Shearman and Lentz, 2010;Xu et al., 2015), which is a much slower process than coastal trapped wave propagation.
4.1.1.3.2. Baroclinic Rossby waves. In the extratropics, first baroclinic Rossby waves, readily apparent in SSH measurements from satellites (Fig. 5), propagate westward at speeds of~2-8 cm s −1 depending on latitude (Chelton and Schlax, 1996). Rossby waves can form near the coast from refraction of coastal waves and subsequently propagate westward across the North Pacific (Jacobs et al., 1994;Meyers et al., 1996;Clarke and Dottori, 2008). However, open-ocean wind forcing, rather than eastern boundary waves, appears to be the dominant driver of Rossby wave formation in the North Pacific (e.g., Miller et al., 1997;Capotondi and Alexander, 2001;Fu and Qiu, 2002;Andres et al., 2011). Once formed, the relatively slowly propagating Rossby waves provide a mechanism for SSH predictability within the ocean basins and along western boundaries. Rossby wave propagation is readily apparent as diagonal lines in Hovmöller (time-longitude) plots of SSH (Fig. 5). In the central Pacific around Hawaii, sea level variability is affected by westward-propagating planetary Rossby waves (Firing et al., 2004) as well as by regional wind-stress anomalies via dynamical (e.g., Ekman pumping) and thermodynamical (e.g., evaporative cooling) oceanic processes (Long et al., 2020). While global forecast systems are not able to resolve the latter fine-scale drivers of coastal sea level variability, on seasonal timescales they are able to skillfully predict large-scale SSH anomalies associated with Rossby wave propagation in regions including the tropical Pacific (Widlansky et al., 2017) and around Hawaii (Fig. 5).
The generation and structure of Rossby waves in the North Atlantic appear to be more complex than in the Pacific (Osychny and Cornillon, 2004). Nevertheless, Rossby wave propagation has been shown to generate predictability for sea-level anomalies in Bermuda (Sturges and Hong, 1995) and along the U.S. east coast (Hong et al., 2000) and, to a lesser degree, SSTa in some portions of the basin (Zhang and Wu, 2010). On interannual and longer timescales, wind-generated Rossby waves can influence Gulf Stream transport (e.g., Sturges and Hong, 2001) and, on reaching the western boundary, generate fast boundary waves that modulate the annual cycle of sea level along the east coast (Calafat et al., 2018). While Gulf Stream transport is difficult to predict on sub-annual timescales, it has important implications for the predictability of sea level in east coast regions prone to nuisance flooding (Sweet et al., 2014;Ezer, 2016).

Advection.
Changes in along-shore currents can lead to anomalous coastal conditions and may provide a basis for physical predictability. For example, during 2002, anomalously cold, fresh conditions along much of the North American west coast, characteristic of subarctic waters, were attributed at least in part to a strengthening of the equatorward California Current (Barth, 2002;Bograd and Lynn, 2003;Freeland et al., 2003;Murphree et al., 2003). When temperature anomalies are balanced by salinity anomalies so that there are no density perturbations (i.e., spiciness anomalies), heat content anomalies can be advected over long distances. For example, subsurface ocean temperature and salinity anomalies can propagate slowly eastward along isopycnals, advected by the mean North Pacific currents, and influence ocean conditions along the west coast of North America years later (Taguchi and Schneider, 2014;Pozo Buil and Di Lorenzo, 2017;Taguchi et al., 2017).
Similarly, the poleward flowing California Undercurrent (CUC) transports relatively warm and salty Pacific Equatorial Water as far north as the Aleutian Islands, affecting the properties of North American west coast shelf and slope waters along its path (Reed and Halpern, 1976;Hickey, 1979;Bograd et al., 2008;Thomson and Krassovski, 2010;Connolly et al., 2014;Nam et al., 2015;Turi et al., 2016;Durski et al., 2017). Changes in the depth of the CUC appear to be especially important, more so than changes in the composition of the CUC water, for influencing shelf waters through seasonal upwelling (Meinvielle and Johnson, 2013). Therefore, predicting undercurrent variability has the potential to contribute to predictability of both surface and subsurface conditions in the CCS. While the CUC is too weak and not properly resolved in global climate models (Hickey et al., 2016), its presence is promising in terms of mechanistic pathways for predictability of ocean conditions within the CCS. For example, the CUC in the Climate Forecast System Reanalysis (CFSR) strengthens and deepens in summer to early fall and periodically surfaces along the entire CCS, consistent with observations in the region (Thomson and Krassovski, 2010;Pierce et al., 2000;Durski et al., 2017). While global forecast systems have yet to be evaluated with respect to the CUC, accurate prediction of variability in its strength and position would likely impart predictive skill for oceanographic conditions along the west coast, and this skill may be enhanced through dynamical downscaling with regional ocean models that can better resolve the CUC. Fig. 6. SSTa forecast skill in the CCS as a function of initialization month and lead time for (a) persistence forecasts, (b) the CanCM4 global forecast system, (c) a forecast that uses CanCM4 for ENSOneutral years and persistence for years following moderate-to-strong ENSO events (1983,1987,1988,1989,1992,1998,1999,2000,2003,2008), and (d) a forecast that uses CanCM4 for years following moderate-to-strong ENSO events and persistence for all other years. White dots indicate significant skill above persistence (95% confidence level). Anomaly Correlation Coefficient (ACC) was calculated for 1982-2009 using NOAA's 0.25°O ISSTv2 as truth.
In the Northwest Atlantic, large-scale ocean circulation such as the Gulf Stream has been linked to ocean temperature on the Northeast U.S. continental shelf, offering promise for ecosystem predictability (Nye et al., 2011). The position of the Gulf Stream, in turn, has been found to respond to NAO extrema with~1-year lag (Joyce et al., 2000;Frankignoul et al., 2001). Therefore, the dynamical and statistical link between the large-scale ocean circulation and the Northeast U.S. shelf temperature provides a robust source of predictability for the coastal environment on the Northeast U.S. shelf. In addition to the Gulf Stream transport, along-shelf advection from the Labrador Sea has been shown to have major impacts on downstream temperatures (Chapman and Beardsley, 1989;Rossby and Benway, 2000;Shearman and Lentz, 2010;Xu et al., 2015). Peña-Molino and Joyce (2008) showed it takes one year for SSTa to propagate equatorward from Nova Scotia to Cape Hatteras along the continental slope. In the Gulf of Maine, subsurface temperature anomalies have been shown to lag the NAO by two years (Mountain, 2012), while SSTa can be traced upstream to the Labrador Shelf four years earlier (Xu et al., 2015). Additional evidence of the impact of alongshore geostrophic advection on shelf temperature was presented by Forsyth et al. (2015), who found a significant correlation between coastal sea level and depth-averaged MAB shelf temperatures two years later, though predictive skill associated with this correlation needs to be explored further. Further influences of alongshore currents, for example Gulf Stream warm-core rings that can induce significant shelf-slope exchange (e.g., Joyce et al., 1992;Chen et al., 2014;Zhang and Gawarkiewicz, 2015), influence the hydrography of the shelf, alter critical habitats, and even introduce species from other regions (e.g., Hare and Cowen, 1996), are beginning to be explored (e.g., Monim, 2017) but are not yet associated with known predictability.

Tropical-extratopical connections
While the atmosphere itself tends to be less predictable than the ocean on S2I timescales (e.g., Davis, 1976), atmospheric forcing (e.g., surface winds and heat fluxes) can give rise to ocean predictability, often associated with large-scale modes of climate variability. The dominant interannual climate signal impacting the Northeast Pacific region is ENSO, and one of the ways ENSO impacts the North American west coast is through an atmospheric teleconnection in which tropical convection excites atmospheric Rossby waves that modify the strength and position of the Aleutian Low, jet stream, and storm track (Trenberth et al., 1998). Consequent impacts along the coast during El Niño include poleward (downwelling-favorable) wind stress anomalies, warm SSTa, and increased precipitation and river runoff along the California coast (e.g., Park and Leovy, 2004;Jacox et al., 2015a). ENSO variability has been shown to be the primary source of seasonal predictability for air temperature and precipitation anomalies over North America (Quan et al., 2006;Peng et al., 2012) and for wind and precipitation anomalies over Hawaii and U.S.-affiliated islands (Annamalai et al., 2014), and the same may be expected for oceanic anomalies that respond to the same teleconnections. Indeed, along the west coast, SSTa forecast skill above persistence derives primarily from predictable ENSO-related anomalies (Fig. 6). Furthermore, a latitudinal gradient in SSTa forecast skill along the west coast, with higher skill in the north, indicates that ENSO-related forecast skill in global forecast systems comes primarily through the atmospheric teleconnection rather than the oceanic teleconnection (Section 4.1.1.3.1; Jacox et al., 2017). The Northern CCS response to ENSO variability is dominated by changes in the wind, while the oceanic teleconnection (i.e., coastal trapped wave propagation) dominates the Southern CCS response to ENSO (Hermann et al., 2009;Frischknecht et al., 2015). Since the former is resolved in relatively coarse resolution global models while the latter is not, the predictable ENSO-driven variability in the north is better captured . These results from global climate models are consistent with those from statistical approaches, namely the LIM (Section 3.1.3), which showed a large fraction of North Pacific SSTa predictability originating from ENSO, likely through the atmospheric teleconnection given the coarse-grained data used in the LIM . The degree of influence that any particular ENSO event exerts in the Northeast Pacific is likely to be affected by the spatial pattern and evolution of the event (Capotondi et al., 2019b) as well as atmospheric internal variability (Deser et al., 2017).
The ENSO teleconnection also influences coastal waters in the Northwest Atlantic, although the impact is not as strong as for the west coast Scott, 2002, 2008). ENSO exhibits a weak negative correlation with the North Atlantic Oscillation (NAO) especially in winter (Li and Lau, 2012), which results in a basinwide tripole SSTa pattern primarily through turbulent surface heat flux anomalies Hu et al., 2013). SSTa associated with ENSO exhibit particularly strong amplitude along the northern flank of the Gulf Stream, Northeast U.S. shelf, and the Gulf of Mexico coast (Kwon et al., 2010), and thus act as a potential source for predictability for the coastal ocean along the east coast. Similarly, changes in the North Pacific can potentially be used as a predictor for the Northeast U.S. shelf, as the two regions are linked via the Pacific-North America atmospheric teleconnection pattern (McKinnon et al., 2016;Dai et al., 2017). In particular, the PDO as well as Central/North Pacific spring SST, precipitation, and outgoing longwave radiation anomalies lead Gulf of Maine SSTa by 1-3 months (Chen and Kwon, 2018). A similar lagged response of Long Island Sound air temperature and water temperature to North Pacific anomalies was found to be consistent with atmospheric Rossby wave trains emanating from the Western Equatorial Pacific (Schulte et al., 2018). It is not yet clear whether the Pacific SST anomalies are at least partly driving the atmospheric teleconnections or both the Pacific and Atlantic SST anomalies are responding to a common teleconnection with a time lag. Surface and bottom water properties on the Northeast U.S. shelf have also been associated with the NAO 2-4 years earlier (Mountain, 2012;Xu et al., 2015), though that relationship is mediated by changes in alongshore advection (Section 4.1.1.4) rather than resulting from direct atmospheric forcing.
Though less well studied than the ENSO teleconnections described above, additional tropical/extra-tropical coupling may extend predictability to slightly longer timescales. Boreal winter variability in the North Pacific Oscillation (NPO) and its oceanic expression, the North Pacific Gyre Oscillation (NPGO), can energize sea level pressure and surface temperature anomalies the following winter through a tropical bridge Di Lorenzo and Mantua, 2016). Specifically, weakening of the off-equatorial winds associated with a positive winter NPO pattern can trigger the growth of meridional modes (Vimont et al., 2003;Chiang and Vimont, 2004) with SSTa reaching the tropical Pacific in spring/summer and activating an ENSO feedback with teleconnection back to the extra-tropics the following fall/winter. This connection from extra-tropics to tropics and back again can potentially offer predictability on timescales of 1-2 years (e.g., Joh and Di Lorenzo, 2017;Capotondi et al., 2019b) and is predicted to intensify under climate change (Liguori and Di Lorenzo, 2018).

Sea-ice related processes in subarctic regions
The regional seas of the U.S. subarctic (including the Bering, Chukchi, and Beaufort Seas) are seasonally covered by sea ice whose extent, thickness, and timing of arrival and retreat vary considerably from year to year. In the Eastern Bering Sea, sea ice area and thickness are influenced by advection and local formation and melting, which are governed in turn by surface forcing and regional oceanography (e.g., Cheng et al., 2014), and these interactions potentially give rise to predictability of the ocean through several mechanisms. For example, the pan-Arctic ice area has a "melt-to-freeze" season memory (Blanchard-Wrigglesworth et al., 2011), effectively a reemergence mechanism in which the ice edge in fall returns to where it was in spring (after a retreat in summer) in response to SSTa that have persisted in the region of the spring ice melt (Bushuk et al., 2014). Arctic ice area also has a "freeze-to-melt" season memory, which means simply that anomalous ice area and thickness in fall/winter persist to the following spring, though dynamical forecast skill often exceeds persistence forecast skill for regional Arctic sea-ice extent (Bushuk et al., 2017).
In addition to providing predictability for sea ice itself, persistence of sea ice can cascade into predictability of the marine ecosystem, as ice presence and subsequent melting impacts ocean properties from spring into summer (Stabeno et al., 2012;Brown and Arrigo, 2013). In the Eastern Bering Sea, winter sea ice exerts control over the extent of the biologically important summer cold pool on interannual timescales (Fig. 7) as well as in the multidecadal trend, which has shown decreased/thinning sea ice and reduced cold pool extent since the early 1980s (Mueter and Litzow, 2008). While the U.S. east coast does not experience sea ice locally, it receives cold and fresh waters of subarctic origin via advection through an extensive interconnected coastal boundary current system (Section 4.1.1.4). Sea-ice variability in the Labrador Sea, which exhibits considerable forecast skill for lead times up to at least 11 months (Bushuk et al., 2017), could have predictable downstream impacts on the Northeast U.S. shelf.

Biogeochemical response to physical forcing
The mechanisms of physical predictability described in Section 4.1 can also lead to predictability of ocean biogeochemical tracers and ecosystem variables. For example, equatorward wind anomalies along the North American west coast, which respond predictably to ENSO variability (Section 4.1.2), have a twofold impact on the nutrient flux into the upper ocean of the CCS region (Jacox et al., 2015b). In a canonical El Niño event, weaker equatorward winds drive weaker upwelling and also draw waters from shallower ocean depths, both of which reduce upward nutrient flux as well as increasing pH along the coast. The opposite is true during La Niña, when anomalously strong equatorward winds drive strong upwelling, increasing the nutrient supply to the euphotic zone and reducing pH in near-surface coastal waters (Jacox et al., 2015b;Turi et al., 2018). Durski et al. (2017) found that in addition to wind-driven upwelling, alongshore currents and local productivity on the shelf also drive interannual oxygen variability in the CCS, though the predictability of these latter two drivers is yet to be quantified. Off the Oregon/Washington coast, seasonal forecast skill has been demonstrated for biogeochemical properties including subsurface oxygen, pH, and aragonite saturation state . While the mechanisms underlying this skill on interannual timescales are still being investigated, seasonal oxygen declines are driven jointly by local nutrient trapping and physical transport processes (Siedlecki et al., 2015) including advection in the California Undercurrent, which has relatively low pH and dissolved oxygen content and relatively high inorganic carbon and nutrient concentrations (Hickey, 1979;MacFadyen et al., 2008;Pierce et al., 2000).
On longer (multiannual) timescales, predictability of biogeochemical variables in the Northeast Pacific has been attributed to advection of anomalies in the North Pacific gyre and ocean memory (Chikamoto et al., 2015;Sasaki et al., 2010;Taguchi and Schneider, 2014;Di Lorenzo, 2015, 2017) as well as "double integration" effectsintegration of the atmospheric forcing by ocean physics, which are then integrated by ocean biogeochemistry (Ito and Deutsch, 2010;Kilpatrick et al., 2011;Di Lorenzo and Ohman, 2013).

Species responses to environmental change
Marine species behaviors, movements, growth, and mortality are driven by extrinsic and intrinsic mechanistic forcing, and for many ecological metrics applicable to forecasting (Table 1), these mechanistic links are approximated by statistical models. For example, water temperature is an important covariate in ecological forecasting, and reflects physiological functioning and limits (e.g. thermal tolerance and biological rates for metabolism, growth, digestion, performance, and activity). In the Eastern Bering Sea, the extent and timing of winter sea ice  (2008) year from output of the Bering10K model hindcast. Bottom waters with temperature < 2°C form the cold pool. (b) Ice cover anomaly and bottom temperature anomaly indices over the Eastern Bering Sea shelf (the negative of bottom temperature anomaly is plotted to enable comparison with ice cover anomly). The ice cover index is the average ice concentration in the region 56-58°N, 163-165°W for Jan 1-May 31. Ice concentration data are obtained from the National Snow and Ice Data Center (NSIDC) using the bootstrap algorithm for historical data (through~2010) and the NASA Team algorithm for more recent data. Mean annual bottom temperatures were calculated from NOAA/NMFS bottom trawl surveys of the Eastern Bering Sea, conducted June-August and sampling~250-300 stations in a region spanning approximately 54.5-62°N and 179-158°W. Both the ice cover and bottom temperature anomalies were standardized by removing their mean and dividing by their standard deviation (c) Cold pool index from observations, hindcasts, and nine-month lead forecasts from the Bering10K model. Predictions of the cold pool are for July 1 (the middle of the summer bottom trawl survey). In 2018 the forecast failed, largely due to unprecedented southeasterly winds that prevented sea ice formation. Subsequent skill evaluation has suggested skillful prediction is limited to shorter lead times (~3 months) when forecasts are initialized in fall, while longer lead (~6 month) forecasts have skill when initialized in spring.
control spring/summer water temperature ( Fig. 7; Section 4.1.3), which in turn alters the timing of the phytoplankton spring bloom, crustacean zooplankton availability, and the condition factor, metabolic rates, demand for prey, and survival of young walleye pollock. Anomalously warm water and associated reduced spring phytoplankton biomass lead to lower abundance of large crustacean zooplankton. Juvenile pollock consumption rates are increased in higher temperatures despite the reduction of available zooplankton prey, leading to fish that are relatively long but in poor condition ( Fig. 8; Moss et al., 2009;Sigler et al., 2014;Duffy-Anderson et al., 2017), and lowering survival the following winter.
While the use of a mechanistically sound set of predictor variables may reduce the problem of overfitting to the dependence structure of the response, methods to evaluate transferability will remain essential for model selection and skill assessment of empirical models, particularly when predicting outside the range of observed conditions (Wenger and Olden, 2012;Roberts et al., 2017;Yates et al., 2018). In complex ecological systems, where biological processes are driven by interspecies relationships and a range of environmental drivers whose relative importance shifts over time, the most successful empirical ecological forecasts may be the ones of biological processes linked to a direct physical driver (e.g., temperature and growth) rather than one whose effect is mediated by trophic relationships and other processes (Cuddington et al., 2013;Payne et al., 2017). Thus, a mechanistic link between environmental forcings and biological responses will be key to forecast reliability, and first principle approaches may help in this respect. Ecological models for prediction may also have to be more simplistic than those describing historical change, as some physical variables included in an explanatory model will not be forecast skillfully and should be excluded from a predictive model. For example, a species distribution model for dolphinfish, Coryphaena hippurus, used four physical covariates to describe historical patterns (Brodie et al., 2015) but only one physical covariate for a seasonal forecast (Brodie et al., 2017). Similarly, the spatiotemporal scales of environmental variability associated with biological responses must be consistent with the scales of predictability in the environmental variability itself. For example, decomposing SST variability into a seasonal climatology, a low-frequency (~interannual) component, and a high-frequency component (sub-annual) shows that skill in a swordfish distribution model  is most strongly associated with the climatological component, followed by the low-frequency and high-frequency components (Fig. 9). One would expect some distribution predictability based on the climatology and skillful forecasts of interannual SST variability , but distributional changes associated with high frequency (e.g., eddy) variability are unlikely to be predictable on S2I timescales.

Species' life history
Just as the forecast horizon for physical ocean variables is longer than atmospheric ones due to the intrinsically longer timescale of ocean processes (Goddard et al., 2001), the lead time of ecological forecasts can be enhanced by the relatively slow turnover of animal populations. Deriving predictive skill from the influence of observed populations on future year classes requires that the species life history be well studied, which is true for many fish species. Most of the variability in survival of fish populations occurs during early life stages, before fish are recruited to the fishery (Walters and Martell, 2004). Therefore, outside of this early life history, knowing the number of fish in one age class can inform how many fish of the next age class to expect the following year. Indeed, stock assessment models use observations of abundance/biomass to track cohorts over time and make predictions of biomass one to three years in the future. Such predictions are like persistence forecasts, as the "observed" biomass in the current year (actually a model estimate obtained from assimilating catch and survey data) is used to predict biomass the next year. In such forecasts, processes such as recruitment, growth, and mortality that may contribute to gains or losses in biomass are generally assumed constant at recent levels rather than mechanistically forecasted based on environmental conditions. However, multi-species statistical catch-at-age models allow for natural mortality rates to change annually , as is done in the climate-informed multi-species stock assessment for walleye pollock, Pacific cod, and arrowtooth flounder in the Eastern Bering Sea (Holsman et al., 2017). To the extent that these models include temperature, prey quality, and other environmental factors, the persistence observed will be passed on to the population dynamics of the stocks. Alternatively, a range of plausible productivity (recruitment, growth, and mortality) values sampled from historical variability can be used to produce an ensemble of future biomass forecasts. The predictability horizon of such fisheries forecasts is dependent on a species' life expectancy and population demography, with longer lead times for species that have lower natural and fishing mortality ( Fig. 10; Brander, 2003). Starting from an observed abundance, natural and fishing mortality rates can be used to estimate the point in time when a population will be comprised by equal parts of fish that were observed at forecast initialization and subsequent recruits (Brander, 2003). After this point, skill is less dependent on persistence of observed age classes and more on the validity of model assumptions concerning future productivity.
For fast-growing, short-lived species, particularly those in singlecohort fisheries such as squid (Agnew et al., 2002), little to no predictive skill can be derived from persistence of the population from year to year. In these cases, incorporation of biogeochemical or physical forecasts is key to enhancing predictability. For instance, use of a seasonal SSTa forecast can enhance the recruitment predictability horizon and lead to more effective catch targets for Pacific sardine (Tommasi et al., 2017b). Similarly, recruitment forecasts for yellowfin flounder are more accurate when informed by an environmental covariate (Miller et al., 2016). For semelparous species like salmon, which are recruited to the fishery as mature spawning adults and die after spawning, recruitment forecasts are essential to assess the strength of the incoming adult year-class. Ocean conditions impact salmon survival during their first few months at sea, and that survival is later reflected in adult populations when they return to spawn. Thus, observed ocean conditions can be used to forecast recruitment to the fishery based on salmon life history with lead times up to 2-3 years (Peterson et al., 2014). In many cases environmentally informed salmon abundance forecasts exhibit increased skill relative to the sibling models commonly used in management (which rely on estimated freshwater returns to predict ocean abundance or freshwater returns at a later date) (Winship et al., 2015). However the reliability and optimal construction of environmentally informed forecasts vary over time (Winship et al., 2015) and their skill is likely a function of the ocean state; both bottom-up (food availability) and top down (predation) impacts on salmon survival may be more prominent when ocean conditions are poor (Wells

Better resolution of key processes in general circulation models
The last two decades have witnessed remarkable advances in the fidelity of ocean model simulations due to (i) steadily increasing computational resources that have enabled enhanced model resolution, (ii) development and implementation of sophisticated parameterizations of unresolved ocean processes (e.g., vertical mixing), and (iii) algorithmic improvements to crucial numerical implementations (e.g., tracer advection schemes). Nonetheless, additional improvements in physical models are required to better represent key processes in the future. In coastal waters, increases in resolution are particularly important as length scales of physical variability (e.g., the first baroclinic Rossby radius) decrease with ocean depth, approaching zero near the coast. For example, small-scale changes in atmospheric forcing (e.g., due to coastal orography) and variability in physical boundaries such as coastlines and bathymetry (e.g., capes, bays, canyons, and banks)  (Neveu et al., 2016). Also shown are three temporal components of the SST, calculated from 1980 to 2016 daily SST output. For each 0.1°grid cell, a monthly climatology is calculated and then removed from the daily SST time series to produce anomalies. The anomalies are then smoothed with a 12-month running mean to produce the low frequency (interannual) component, with the remaining signal making up the high frequency (subannual) component, such that Full = Climatology + Low Frequency + High Frequency. (b) Predictive performance of swordfish distribution models built using Boosted Regression Trees  with different temporal components of the SST as predictors. Model predictive performance is measured using the area under the receiver operating characteristic curve (AUC) statistic. Fig. 10. Contribution of biological persistence to prediction skill for Pacific sardine, Albacore tuna, and shortfin mako shark, as measured by the percentage of observed abundance remaining in the population in later years. All species are assumed to experience a fishing mortality rate of 0.25 year −1 . Natural mortality is 0.6 for sardine (Hill et al., 2017), 0.4 for albacore (ISC, 2017), and 0.128 for mako shark (ISC, 2018). meaningfully influence the coastal ocean circulation. Biogeochemical properties are likely to exhibit strong gradients on even smaller spatial owing to their fast and sometimes nonlinear responses to the physics that drive them (e.g., Mahadevan and Campbell, 1926). In particular, upwelling that is a fundamental driver of coastal ecosystems exhibits considerable alongshore heterogeneity in both amplitude and depth of source waters, and consequently nutrient transport and primary production. Tidal exchanges with estuaries, combined with estuarine mixing, complicate the physical and biogeochemical signatures of river outflows that unavoidably influence local coastal ecosystems. Internal waves generated offshore by the barotropic tide are directed onshore in heterogeneous beams that may reflect at the continental slope and shelf (depending on the local topographic slope and seasonally varying stratification) or continue onshore, ultimately breaking and contributing to local vertical mixing and nutrient transport. Surface windwaves influence local circulation and ecosystem elements in at least two ways, through alteration of near-surface circulation due to wave-induced surface stress and Stokes' Drift of near-surface organisms. Finally, we require a much better understanding of the impacts of near-surface submesoscale (1 km or less; hours to days) processes including intense local vertical velocities, local changes in vertical mixing, frontogenesis, frontolysis, and nonlinear Ekman transport. These processes are increasingly modeled, yet their ecosystem impacts are still unclear, and predictability associated with them remains poorly understood.

Reforecast skill assessments
Skill assessments, in which long-term reforecasts (i.e., retrospective forecasts of past conditions) are tested against atmosphere and ocean datasets or reanalyses (i.e., data assimilative historical model runs), provide the means to investigate (i) the spatial and temporal predictability of particular variables like SST, SSH, and precipitation (e.g., Annamalai et al., 2014;Stock et al., 2015;Hervieux et al., 2017;Widlansky et al., 2017), (ii) the benefits of regional downscaling , and (iii) mechanisms underlying forecast skill . Metrics used to assess marine forecast skill in the literature largely mirror those from the meteorological community, and can be deterministic (e.g., anomaly correlation coefficient, root mean square difference) or probabilistic (e.g., Brier score). To this point, marine forecasts have most commonly been assessed for their ability to predict monthly average anomalies of SST and SSH, metrics that are influenced by large-scale climate variability, influence marine species or coastal communities, and have been adequately measured to enable forecast verification over multiple decades. Moving forward, more attention should be paid to predictability of additional environmental variables (Section 2) that influence marine species in different ways, as well as suites of variables that may act as co-stressors for particular organisms.
Verification of rapidly proliferating marine forecasts will require extension of skill assessments not only to additional physical and biogeochemical variables, but also to integrated quantities deemed important as ecosystem indicators (e.g., undersaturation days, summer hypoxic volume) and to higher trophic level characteristics of interest to end-users. Similarly, evaluation of ocean hindcasts (historical model runs forced by observed boundary conditions typically from ocean/atmosphere reanalyses but with no ocean data assimilation) to determine models' ability to simulate processes of interest is a key part of building effective forecast systems (Tommasi et al., 2017a;Siedlecki et al., 2016;Hobday et al., 2016). While comprehensive skill assessments likely require decades of reforecasts initialized at regular intervals, no standard protocol or threshold is currently in place as to the number of seasons, years, or cycles of a climate oscillation that must be included in skill assessments in order to provide confidence in the forecasts going forward. Such protocols are likely dependent on details of a given forecasting application, including the specific region and variable(s), but effort should be made to establish protocols for reforecast experiments to ensure they are adequate for building confidence in forecasts.

Forecast uncertainty and ensembles
Sources of forecast uncertainty include initial condition errors as well as uncertainty due to model errors. The distinction between these two sources of forecast uncertainty is useful pedagogically, but they are not independent in a real physical system like the atmosphere or ocean. Fig. 11. Dependence of CCS SSTa forecast skill on forecast ensemble size. (top) Skill (Anomaly Correlation Coefficient, ACC) as a function of initialization month and lead time for persistence forecasts, individual CanCM4 runs, and CanCM4 ensembles of 5 and 10 members. (bottom) Skill as a function of lead time for persistence, the NMME ensemble mean, and CanCM4 ensemble sizes of 1-10 members. All possible combinations of ensemble members were tested for each ensemble size. As CanCM4 has 10 ensemble members, there are 10 possible one-member ensembles (i.e., individual runs), 45 possible two-member ensembles, 120 possible three-members ensembles, and so on up to one possible 10-member ensemble. For each ensemble size, the skill of each possible combination is shown as a thin line and the mean skill of all possible combinations is shown as a thick line. In all cases, ACC was calculated for 1982-2009 using NOAA's 0.25°OISSTv2 as truth.
Owing to both model and observational limitations, initial conditions for any forecast of a chaotic system can only be estimated within a certain accuracy. Decades ago, Lorenz (1982) quantified forecast error growth as a function of initial condition errors by studying the rate at which solutions of a numerical weather prediction model diverged. Now, there is growing interest in quantifying forecast uncertainty for many areas of the earth system and on multiple timescales. Methods for uncertainty quantification need to cope with sensitivities to initial conditions, interactions of many spatial and temporal scales in the earth system, and the fact that the sources of uncertainty are themselves uncertain.
Moving from physical forecasts to biogeochemical and ecological forecasts, additional uncertainty is introduced. An ecosystem model, in general, will exhibit variability associated with a response to physical forcing plus intrinsic variability associated with nonlinearities in the ecology itself. Therefore, even if a physical model can skillfully predict the environmental changes in the system and an ecological model can properly represent the complexity of the ecosystem response, errors may propagate nonlinearly through the food web in ways that limit our ability to predict the ecosystem response. For example, errors in physically driven nutrient fluxes will lead to errors in phytoplankton growth, which then cascade to errors in zooplankton growth and so on up the food chain to fish and higher-order predators. If the error growth is linear, predictions of target species may not be adversely affected. But, if the error growth is exponential, even small errors in physical forcing could lead to saturated error statistics in the mid-to-high trophic levels. This type of error propagation through trophic levels has not been quantified in typical ecosystem models currently being used and should be explored and quantified, both in its dependence on parameters and its dependence on the complexity of the model food web dynamics.
One common method of quantifying uncertainty and improving forecast skill in physical systems is to use model ensembles. For deterministic forecasts, which offer a single predicted outcome with no statement of uncertainty (Wilks, 2011), the issued forecast may be the mean or median of the ensemble. In this approach, forecast skill for a multimodel ensemble mean tends to be as good as or better than the skill of the best model in the ensemble, even when the ensemble includes models with relatively poor skill . Similarly, for a single model, the mean forecast computed from multiple ensemble members is more skillful than the forecast from any single model realization (Kumar and Hoerling, 2000). In the example of SSTa forecasts in the CCS,~3-4 ensemble members are needed to reliably beat a persistence forecast on 2-4 month lead times. At longer leads (> 6 months) a single model run can beat the skill of a persistence forecast, but larger ensembles improve skill up to a point, with each additional ensemble member offering diminishing returns (Fig. 11). Ecological model ensembles may consist of models that are fundamentally different in their construction (i.e., using very different statistical techniques), rather than of multiple realizations from models that are structurally very similar (as is typical in the physical realm). Nonetheless, there are indications that the same general finding (i.e., an ensemble mean forecast outperforms the individual models that contribute to it) holds at least for species distribution modeling (Marmion et al., 2009). However, the use of an ensemble mean as a deterministic forecast omits potentially useful information the ensemble can provide about forecast uncertainty (Demargne et al., 2010). In contrast to deterministic forecasts, probabilistic forecasts issue an expected outcome, taken from a set of possible outcomes, by providing a summary measure of the ensemble or assigning a probability distribution to the forecast data (Potts, 2011). Probabilistic forecasts provide explicit statements of uncertainty regarding how well the future state of a system is known (Wilks, 2011). Knowing the probabilities of different outcomes is beneficial from a risk management and decision-making perspective, and use of probabilistic information is commonplace with respect to weather forecasts (e.g., deciding whether to carry an umbrella based on forecast likelihood of rain). Examples of probabilistic forecasts in marine resource management are less common, though probabilistic SST forecasts are used to predict the likelihood of coral bleaching  as well as El Niño events (Kousky and Higgins, 2007), which in turn have been used to inform fishery closures to avoid bycatch (Welch et al., 2019b). In addition, retaining the distribution of potential outcomes will be important for forecasting extreme events, with large forecast ensembles (e.g., 50-100 members) likely necessary to forecast conditions in the tails of distributions (Doi et al., 2019), particularly if the ocean extreme is not driven by a predictable deterministic forcing mechanism .

Computing resources
S2I prediction requires significant computing infrastructure, in terms of both processing power and data storage. The climate models run at these time scales include more components and complexity than weather forecast models (Mariotti et al., 2018), and large suites of reforecasts are needed to evaluate model skill and quantify biases. For example, the typical contribution of an individual modeling center to Phase I of the NMME included 10 ensemble members each run for 30 years of reforecasts, initialized monthly and extending to lead times of 12 months, totaling 3600 years of model simulations. Output from these runs must then be stored, and while 2D physical fields such as SST and SSH have most commonly been analyzed, developing marine ecosystem forecasts necessitates saving 3D ocean fields including temperature, salinity, currents, and biogeochemistry (when available) at temporal resolutions not less than monthly and ideally much higher. When high-resolution regional ocean models are needed to properly represent nearshore dynamics critical to marine ecosystem forecasts, an added layer of computational demand is incurred. The global forecast must be run first, with an additional suite of surface atmospheric state variables and fluxes saved at daily or finer resolution and used to force the regional model. Operationally, this results in a longer forecast production time, additional data storage for non-standard global variables as well as regional model output, and demands on data sharing as global and regional models are typically not run on the same architecture or by the same people.
Beyond serving as a restriction to operational forecasting efforts, these computational challenges present a difficulty to the researchers outside of operational centers and federal laboratories. Running relevant experiments in the S2I prediction architecture and performing data transfer and analysis on large datasets often requires computational resources inaccessible to researchers not affiliated with large institutions. Funds from research grants are often diverted to purchase local storage and analysis systems, or researchers compete for increasingly oversubscribed federal high-performance computing resources. In this respect, the marine forecasting community benefits immensely from efforts to make forecast outputs publicly available, both by individual researchers and centers and through collaborative projects like the NMME, LC-LRFMME, and SubX.

Observational limitations
Predictive skill is often reduced by the limitations of the prediction system, which are, in turn, related to limitations of the observations that are key in the development, initialization, and verification of models used for prediction (Capotondi et al., 2019a). While satellite sensors provide regular global coverage for certain surface ocean properties (e.g., SST and SSH), co-located surface and subsurface measurements are important for producing consistent atmospheric and oceanic initial conditions, verifying coupled feedbacks, and, in the case of coupled ocean-atmosphere reanalyses, constraining the coupled error covariance matrix. However, such measurements are sparse. Argo floats have considerably extended our ability to sample the subsurface ocean, but it is challenging for Argo and drifting buoys to provide sustained measurements in strong currents or near the equator. Likewise, seasonal events such as hurricanes or sea-ice formation and melt can prohibit sustained deployment of in situ observing platforms. Biogeochemical data available for model initialization are even more sparse, often leading to long spin-up times for biogeochemical variables (Wunsch and Heimbach, 2008), and a possible mismatch between biogeochemistry and physics (Seferian et al., 2016). The limited spatiotemporal coverage of biogeochemical data, combined with the variable timing and often ephemeral nature of biogeochemical anomalies, are limiting factors in model verification, as short observational records may not be representative of the model climatology.
The long-term stability of the observing system and the availability of long and homogeneous records are critical for error minimization and verification of ocean and atmosphere reanalyses (Balmaseda et al., 2015, Xue et al., 2017. Similarly, long and homogeneous records are critical for the development and verification of statistical prediction systems, whose performance relies on the robustness of statistical relationships between predictands and predictors. On regional scales, forecasting activities rely heavily on the availability of observations that target the spatial and temporal scales of interest, and come from a diverse set of observing assets including moorings and floats, autonomous underwater vehicles, research cruises, and satellite sensors (e.g., Siedlecki et al., 2016;Ortiz et al., 2016). Preservation and extension of physical, biogeochemical, and ecological data records (i.e., through sustained or expanded support for the platforms listed above and others) are fundamental requirements for developing, initializing, and validating marine ecosystem forecasts. Additionally, efforts to expand the suite of observations available in marine forecasting would benefit from incorporation of cross-disciplinary technology (e.g., animal-borne telemetry tags) and opportunistic research platforms (e.g., fishing and transportation vessels).

Forecast user engagement
Improving predictability and developing skillful forecasts of marine ecosystems can be highly useful to a variety of end users, including managers and industry representatives. However, the utility of forecasts depends on the forecast skill in time and space, the needs of the enduser (e.g., spatial or temporal resolution and lead time necessary for management decisions), and effective dissemination of information (Hobday et al., 2016). While there are many examples of predictive models developed with management or industry applications in mind, few have developed readily-available forecasts or nowcasts or have been integrated into decision making processes for businesses or managers (though see examples in Howell et al., 2008, Hazen et al., 2017, Hobday et al., 2016. Studies that have successfully applied forecasts to inform decision-making processes highlight the importance of communication between scientists and managers or industry stakeholders, and emphasize that forecast delivery is critical to the utility and implementation of forecasts (Hobday et al., 2016, Dunstan et al., 2018. User engagement and feedback is necessary to refine the format and visualization of the forecast product in order to meet user needs and allow for widespread use and integration of the product into decision-making processes. The transfer of forecasts to end users often takes the form of automated web products providing daily, weekly or monthly forecasts posted, allowing wide dissemination to managers and stakeholders.

Conclusions and summary recommendations
The ability to anticipate changing environmental and ecological conditions provides opportunities to mitigate, adapt to, and plan for their impacts, a great benefit to end-users including marine resource managers, industry representatives, and the general public (Clark et al., 2001). S2I forecasts enable proactive decision making on timescales that are well aligned with marine ecosystem management (Hobday et al., 2018) and constitute a key component of a "climate-ready" fisheries management strategy that leverages information on multiple timescales (Pinsky and Mantua, 2014). The need for skillful forecasts has motivated a growing body of research on marine ecosystem predictability and numerous predictive models have been developed with management or industry applications in mind. To facilitate the development and uptake of marine ecosystem forecasts, we provide summary recommendations for future work in four related areas: model development, skill and uncertainty assessment, forecasting infrastructure, and communication with end-users.
Model development • Improve representation of key processes via model resolution and/ or complexity as appropriate, and quantify predictability associated with these improvements.
• Quantify the sensitivity of regional ecosystem responses to physical and biogeochemical formulations at model boundaries, including methods for matching biogeochemical fields between global and regional domains and investigating the degree to which signals that are poorly resolved in global models (e.g., coastal trapped waves, poleward undercurrents, buoyancy-driven coastal currents) can be transmitted through model boundaries into the regional domain where they are better resolved.
• Explore empirical models (e.g., LIM) for both physical and ecological forecasts • Better integrate numerical and statistical methods (e.g., through hybrid statistical/dynamical forecasts and statistical pre/post-processing of dynamical forecasts) to maximize predictive skill • Develop statistical models for higher trophic levels based on a priori mechanistic understanding or hypotheses.
Skill and uncertainty assessment • Quantify predictability and forecast skill for a broader range of surface and subsurface variables as well as suites of variables that may be co-stressors for living marine resources or coastal communities.
• Quantify predictability and forecast skill for ecological forecasts themselves, building on physical forecast skill assessment and historical assessments of ecological models.
• Quantify the dependence of predictability and forecast skill on spatiotemporal scale, and the degree to which spatiotemporal scales of physical and ecological predictability are consistent with each other.
• Establish protocols (e.g., temporal extent, skill metrics) for reforecast evaluations to ensure they are adequate for building confidence in forecasts.
• Quantify forecast uncertainty, including error propagation from physics to biogeochemistry and higher trophic levels.
• Assess skill not only for properties of interest, but also for the mechanisms that drive variability in those properties.

Forecasting Infrastructure
• Make forecast fields readily available at spatiotemporal resolutions needed for analysis and application to marine resource management, and (in the case of global forecasts) to force downscaled regional forecasts.
• Preserve and extend physical, biogeochemical, and ecological data records needed to develop, initialize, and validate marine ecosystem forecasts.

Communication
• Establish and maintain communication between scientists and endusers early in the forecast development process, to ensure that forecasts align with the needs of the end-user, and in the implementation and delivery phase, to refine the forecast product and facilitate uptake in decision-making processes.
• Develop methods to bring uncertainty in the mechanisms of predictability into the communication of forecast uncertainty (e.g., in communication of conditional forecast skill).
In this paper we have provided a review of existing forecasting approaches, described environmental and ecological dynamics that have been (or could be) exploited in predictability studies and forecast development, and recommended priorities for future work. While our focus is on coastal ecosystems surrounding North America, and the U.S. in particular, the content (including forecasting approaches, physical and biological mechanisms, and priority developments) apply globally. Our aim is to support and motivate continued advances in the field of marine ecosystem forecasting. To that end, we suggest that if marine forecasts are to effectively support ocean decision making, particular attention should be paid to the mechanisms underlying their skill. This mechanistic understanding of marine ecosystem predictability will allow the research community to target specific areas for forecast skill improvement, quantify and communicate forecast uncertainties, ensure predictive relationships hold up over time, and provide a basis for enduser confidence in marine forecast products.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.