Bayesian network modelling provides spatial and temporal understanding of ecosystem dynamics within shallow shelf seas

Understanding ecosystem dynamics within shallow shelf seas is of great importance to support marine spatial management of natural populations and activities such as fishing and offshore renewable energy production to combat climate change. Given the possibility of future changes, a baseline is needed to predict ecosystems responses to such changes. This study uses Bayesian techniques to find the data-driven estimates of interactions among a set of physical and biological variables and a human pressure within the last 30 years in a well-studied shallow sea (North Sea, UK) with four contrasting regions and their associated ecosystems. A hidden variable is incorporated to model functional ecosystem change, where the underlying interactions dramatically change, following natural or anthropogenic disturbance. Data-driven estimates of interactions were identified, highlighting physical (e.g. bottom temperature, potential energy anomaly) and biological variables (e.g. sandeel larvae, net primary production) to be strong indicators of ecosystem change. There was consistency in the physical and biological variables, identified as good indicators in three of the regions, however the shallower region (with depths < 50 m, that is targeted for static offshore wind developments) was the most dissimilar. The use of contrasting regions provided useful insights on responses linked to ecosystem disturbances and identified the top predators as better indicators for each region, with the harbour porpoise being a particularly valuable indicator of ecosystem change across most regions. Another important finding was the dramatic changes in the strength of many interactions over time. This suggests that physical and biological indicators should only be used with additional temporal information, as changes in strength led to the identification of two potentially significant periods of ecosystem change (after 2005 and after 2010), linked to physical pressures (e.g. cold-water anomalies, seen in bottom temperatures; salinity changes, seen in the potential energy anomaly) and primary production changes. The hidden variable also modelled a change in the early 2000s for all the regions and identified maximum chlorophyll-a and sea surface temperature as some of the better indicators of these ecosystem changes.


Introduction
There is about to be an abrupt step-change in the use of our shallow shelf seas around the globe, specifically by the addition of large-scale offshore renewable energy developments to combat climate change. The extent of these developments may end up using > 30% of shallow shelf seas (Scottish Sectoral Marine Plan for Offshore Wind Energy, 2020; Outer Continental Shelf Renewable Energy Leases, BOEM USA, 2021; Available at 14th FYP Development Plan for Renewable Energy, 2020; the 14th FYP Development Plan for Renewable Energy, China, 2020). Given that, many trade-offs will need to be weighed up rapidly for the future sustainable management of marine ecosystems between different uses of our shallow shelfs seas, e.g. renewables, fisheries, marine protected areas. However, to proceed with more certainty, we need a much greater understanding of how different marine ecosystems, and specifically their multiplicity of physical and biological interactions, are likely to change across different locations in space and over time with both climate and anthropogenic transformations. Given the complexity of marine ecosystems, full understanding might be challenging, so we need to have pragmatic methods and advances. We need to make the best use of knowledge about interactions and mechanisms from shallow shelf sea ecosystems to provide an effective baseline to predict their responses to natural versus anthropogenic changes, providing rapid strategic advice for more sustainable future spatial use of our seas.
Shallow shelf seas are a vital part of the marine environment. Despite covering only about 8% of the Earth's ocean surface area (Harris et al., 2014), shelf seas support 15-20% of global primary productivity (de Haas et al., 2002) and >90% of the fish we eat (Simpson and Sharples, 2012). Biophysical properties (e.g. temperature, primary production) are vulnerable to and altered by climatic changes and human activities, which can strongly affect coastal and shelf sea ecosystems (Burden et al., 2020). The North Sea is very well studied and is a good example of shallow seas around the world with contrasting large regions of shallower depths (<50 m), that will be targeted by static offshore wind developments, deeper areas (>50 m) that can be utilised by floating wind, regions with high tidal resources and also regions of high wave resource (The Crown Estate Offshore Wind Leasing Round 4; Scottish Sectoral Marine Plan for Offshore Wind Energy, 2020). The North Sea and specifically its proximity to multiple countries makes it a good example of a shallow sea that is highly susceptible to anthropogenic pressures (Capuzzo et al., 2018) and has suffered rapid warming with temperatures increasing by up to 0.24 • C per decade, with pronounced warming north of Scotland and in the North Sea . Much is known about the physical and anthropogenic drivers of change to the North Sea as it receives inputs from large river systems, which, in combination with warming, have consequences for the amount and types of plankton species within primary production driving the base of the ecosystem food chain (Chust et al., 2014). River run-off also affects coastal stratification, which is a key control for shelf sea marine ecosystems as it is one of the main determinants in the spatial distribution of coastal habitat types (Cox et al., 2018). The timing and strength of stratification, which is mainly determined by annual/seasonal weather patterns, tidal forcing and depth (Simpson and Sharples, 2012), is important for primary and secondary production, can influence the distribution of marine animals and has been found to be implicated in the breeding success of seabirds (Platt et al., 2003;Scott et al., 2006;Carroll et al., 2015;Cox et al., 2018). Extensive changes in the characteristics of the plankton including production, biodiversity and species distribution, have been shown to have impacts on fisheries production and seabird populations, mainly driven by ocean warming and natural climate variability, such as the Atlantic Multidecadal Oscillation (AMO) and North Atlantic Oscillation (NAO) (Edwards et al., 2020). In fact, the effects of fishing may have been exacerbated by climate warming and climate-induced changes in primary production, leading to impacts on demersal fish and seabirds (Lynam et al., 2017). The placement of largescale offshore renewable developments to combat climate change will have ecosystem-level effects on marine habitats (Boon et al., 2018;De Dominicis et al., 2018;Sadykova et al., 2020). For example, it has been found that both climate change and very large-scale tidal energy extraction can act in the same direction, in terms of increasing stratification, however the effect of climate change is an order of magnitude higher (by as early as 2050) and the tidal extraction effects are extremely location specific (De Dominicis et al., 2018).
All of these drivers and effects are interrelated within the shallow sea ecosystem and although, significant progress has been made in developing models that use traditional statistical approaches to understand the relationships between a number of variables (e.g. temperature, primary production; Lynam et al., 2017), including dynamic ecosystem models (Spence et al., 2018), all of these models assume that the underlying relationships are in a steady state. This assumption might not be true, as ecosystems are known to sometimes undergo relatively fast structural changes that have a major effect on the ecosystem dynamics (Möllmann et al., 2008). Further, it is possible that the changes are driven by unobserved components, i.e. ecosystem variables that we do not have data on. Thus, it is recommended that ecosystem models develop richer non-mechanistic appreciation of ecological interactions across space and over time due to changing pressures at different levels of the trophic chain (Uusitalo et al., 2018). Therefore, to proceed in a pragmatic manner to better understand ecosystem changes from both climate change and anthropogenic pressures, we adopt a relatively novel predictive and non-mechanistic technique: Bayesian networks, capable of inferring functional network structures, capturing nonlinear, dynamic and arbitrary combinatorial dependency relationships (Heckerman et al., 1995). Few assumptions are made about the data and complex, spatially varying interactions can be recovered from collected field data, as demonstrated by Trifonova et al., 2015. We used a hidden variable to enable the modelling of non-stationary dynamics (Tucker and Liu, 2004). This is potentially highly useful in ecological analyses where complex ecological interactions change in time due to changing pressures at different levels of the trophic chain.

Study aims
This study used a machine learning optimization technique to find the data-driven estimates of interactions among a set of physical and biological variables, that included the critically important factors for ecosystem functioning, discussed above (e.g. stratification, primary production, temperature) and examined their changes across contrasting spatial regions and over 30 years within UK waters. The 1st aim of the study was to compare the physical and biological interactions and their changes within four different regions (approximately of 3500 km 2 size each) with contrasting habitat types (based on the habitat's physical and biological characteristics, Fig. 1) and over the last 30 years to identify the best indicators of abundance or biomass change and which species or functional groups are more reflective of such change. Using this knowledge about the best indicators, the 2nd aim was to determine similarity and differences between the regions, i.e. ecosystems. The 3rd aim of the study set out to identify the timing of any potential functional ecosystem changes (where the underlying interactions dramatically change), by examining the temporal trends of the interactions. The contrast in the regions will allow their characterization in terms of their spatial and temporal ecosystem dynamics and provide an understanding of their potential response to ecosystem changes. We used a dynamic autoregressive hidden Markov model (ARHMM), i.e. one that explicitly represents the behaviour of the system over time, that incorporates a hidden variable to enable the modelling of non-stationary dynamics (Tucker and Liu, 2004). A hidden variable can be used to detect a change in the interactions of the observed variables over time. Its value depends on all the observed variables it is linked to, and a change in the pattern of the hidden variable indicates a change in the system interactions. Therefore, as part of the 4th aim, we wanted to see whether the hidden variable can also be used to model functional changes in the ecosystem dynamics. This would enable us to further identify the timing of such changes and identify the best indicators of such changes and further characterize any similarities or differences in the spatial and temporal ecosystem dynamics of the contrasting regions and their associated ecosystems.

Study regions
The four contrasting regions are: the Shetland/Orkney region, the west of Scotland region, and the deep and shallow central North Sea regions (Fig. 1). The choice of regions was based on their physical and biological characteristics; we wanted to compare a range of different habitats (shelf edge, Atlantic influenced, deep and shallow coastal seas), that are also key with respect to future offshore renewable energy developments. The division of the central North Sea was performed based on bathymetry, the deep region is characterized by a depth, > 50 m, where floating wind developments may proliferate, whilst the shallow region, with depths < 50 m, encapsulates the proposed locations of one of the world's largest development of static wind farms (the Dogger Bank wind farm 1 ). The Shetland/Orkney region is a key location for tidal energy extraction, whilst the west of Scotland is targeted for wave energy development (Scottish Sectoral Marine Plan for Wave and Tidal Energy, 2020).

Ecosystem components
The data consisted of annual values (1990-2019) with seasonal mean values (summer: July, August, September and October) on all physical (temperature and stratification, except for current speeds), biophysical (primary production) and biological (abundance, biomass or breeding success) variables for each spatial region. We refer to all the variables in the study as "ecosystem components" but distinguish components based on them being either physical (e.g. horizontal currents speed) or biological (e.g. sandeel larvae) indicators (Table 1).
The ecosystem components in the study were chosen as they cover the main physical and biological variables that have been shown to be important to marine mammals and seabirds and their prey (Carroll et al., 2015;Wakefield et al., 2017;Chavez-Rosales et al., 2019). These will change with climate change and, also with the next biggest change to our shallow seas: very large extraction of energy from offshore renewable developments (Wakelin et al., 2015;Holt et al., 2016;van der Molen et al., 2014;Sadykova et al., 2017;Boon et al., 2018;De Dominicis et al., 2018). The seabird and marine mammal species were chosen to provide contrasts in their foraging and breeding behaviours, and for the high level of spatial and temporal data availability. The human pressure (i.e. fisheries catch) was chosen due to its importance in controlling the marine food webs within the North Sea (Mackinson et al., 2009;Lynam et al., 2017). For more details on the ecosystem components, please refer to the Supporting Information (SI).

Bayesian networks
Formally, a Bayesian network (BN) describes the joint distribution (a way of assigning probabilities to every possible outcome over a set of variables, X 1 …X N ) by exploiting conditional independence relationships, represented by a directed acyclic graph (DAG) (Friedman et al., 1999). The conditional probability distribution (CPD) associated with each variable X encodes the probability of observing its values given the values of its parents and can be described by a continuous or a discrete distribution. In this case, the CPD is called a Conditional Probability Table (CPT) and all the CPTs in a BN together provide an efficient factorization of the joint probability: where pa i are the parents of the node x i (which denotes both node and variable).
The DAG consists of nodes (or variables) and edges (or links) between the nodes. "Parent" nodes are those from which arrows originate and "child" nodes are those to which arrows are pointing. Edges between nodes represent dependency relationships. Each node in the DAG is characterized by a state which can change depending on the state of other nodes and information about those states propagated through the DAG. By using this kind of inference, one can change the state or introduce new data or evidence (change a state or confront the DAG with new data) into the network, apply inference and inspect the posterior distribution (which represents the distributions of the variables given in the observed evidence). The graphical structure of BNs is particularly convenient when we aim to describe an ecological network to model all the interactions between species and their environment that also provides a user-friendly framework to communicate the results (Chen and Pollino, 2012). It is relevant to think of the BN as a "graph", describing species as the "nodes" within the graph, and interactions as the links or "edges" that join the nodes (Faisal et al., 2010).

Dynamic Bayesian networks and autoregressive hidden Markov model
Modelling time series is achieved by using an extension of the BN known as the Dynamic Bayesian Network (DBN), where nodes represent variables at time slices. DBNs are directed graphical models of stochastic processes that characterize the unobserved and observed state in terms of state variables, which can have complex interdependencies (Murphy, 2001). DBNs generalize hidden Markov models (HMMs) which model the dynamics of a dataset using a hidden variable. This hidden variable is used to model unobserved variables and missing data and can infer some underlying state of the series when applied through an autoregressive link (ARHMM, Fig. 2) that can capture relationships of a higher order (Murphy, 2001). The hidden variable allows us to examine unmeasured effects that would bring further insight on the importance of ecosystem dynamics to better understand community structure and resilience in an exploited ecosystem (Trifonova et al., 2015;Uusitalo et al., 2018). In most domains, the observed variables represent only   (2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015)(2016)(2017)(2018) some characteristics of a system, which can have a negative effect on the learning procedure. For example, the apparent complexity of a predicted variable can be explained imagining it is a result of two simple processes, the "true" underlying state, which may evolve deterministically, and our measurement of the state, which is often noisy (Murphy, 2002). We can then "explain away" unexpected outliers in the observations, as opposed to strange fluctuations in "reality".

Experiments
The methods in this study consisted of two steps: 1) We learn the hidden variable based on the values of all the observed ecosystem components from the ARHMM (Fig. 2) performed for each region. In this way, its value depends on all of the observed variables it is linked to, and a change in the pattern of the hidden variable indicates a change in the system dynamics. When the model parameters are fitted with data, the value of the hidden variable is set so that it maximizes the fit of the model to the data (e.g. the log-likelihood). If the patterns of the observed variables change in the time series, e.g. the slope of a dependency between two variables changes, the value of the hidden variable linked to these variables changes. Thus, we use the hidden variable in this study, to represent a change in the underlying ecosystem dynamics (i.e. ecosystem state), following a natural or anthropogenic disturbance to the system interactions in the different spatial regions. We want to compute P (H t |X t , X t − 1 ), where H t represents the hidden variable and X t represents all observed variables at times t. We use the predicted variable states from time t to infer the hidden state at time t. The hidden variable was parameterized using the Expectation Maximization (EM) algorithm (Bilmes, 1998). In the first step of the EM, the hidden variable is inferred using the predicted states, whilst in the second step the estimated likelihood function is maximized. When the algorithm converges to a local maximum, the parameters are estimated. We used an exact inference method: the junction tree algorithm (Murphy, 2001). Non-parametric bootstrap (re-sampling with replacement from the training set, (Friedman et al., 1999) was applied 250 times for each region and season to obtain statistical validation in the predictions. We used the learned hidden variable from each ARHMM and placed it in the hill-climb optimization technique to identify the dependency relationships between the hidden variable and the observed ecosystem components to identify potential key indicators of ecosystem change.
2) We learn the Bayesian network structure for each of the spatial regions by applying a hill-climb optimization technique. The hill-climb search begins with an empty network. In each stage of the search, networks in the current neighbourhood are found by applying a single change to a link in the current network such as "add arc" or "delete arc" and choose the one change that improves the score the most. We used the Bayesian Information Criterion (BIC) for scoring candidate networks: where Θ represents the model, D is the data, n is the number of observations (sample size) and k is the number of parameters. log P(Θ) is the prior probability of the network model Θ, log P(Θ|D) is the log likelihood whilst the term k log(n) is a penalty term, which helps to prevent overfitting by biasing towards simpler, less complex models. The learned Bayesian network links represent dependence, these are spatial relationships that are predictive in an informative, not causal aspect (Milns et al., 2010;Trifonova et al., 2015). The method identifies similarity in the temporal trend of the paired variables (i.e. both variables increase, or as one increases, the other decreases over time). We performed the hill-climb with random restart (n = 10), which conducts several hill-climbing runs, perturbing the result of each one as the initial network for the next. Then, we apply the learning for 1300 iterations for each region. The maximum number of "parent" nodes (learned from the hill-climb) was limited to three to avoid over-fitting (Trifonova et al., 2015). We define a confidence threshold -the minimum confidence (estimate of the probability of finding a relationship) for a relationship to be accepted in the learned network structure. We defined interactions of high confidence in time as those in which we have the greatest mean confidence of being in the generated network (threshold ≥ 0.25). We use the confidence value to represent the strength of each dependency relationship between a pair of two variables. The confidence or strength of the identified relationship represents the level of similarity in the temporal trend of the paired variables. In addition, to learn the network structure for each year in the time window, the hill-climbing was conducted on a window of data (size of window = 10). In this way, we would be able to capture any significant interactions over the previous 10 years. Based on the number of identified interactions between the observed indicators and between the observed indicators with the hidden variable, we define "best" indicators, which represent the most confident data-driven estimates of indicators of ecosystem dynamics and their changes across space and time.

Results
We present identified dependency data-driven estimates of interactions that are currently shaping the ecosystem dynamics in space across UK waters and how they are changing with time, identifying the best indicators leading to such changes. Many of the same interactions were found in all regions, regardless of differences in locations, however at the same time, there were contrasts as the dominance of either physical or biological interactions was clearer in some regions.

Best physical and biological indicators
Overall, the number of interactions between the physical indicators and lower trophic levels was higher, whilst for the higher trophic levels, the majority of interactions were identified with the biological indicators, with just a few physical indicators (Fig. 3). Physical variables that were more often identified as indicators of abundance and/or biomass change across the regions were BT, PEA and Hspeed. Some relationships were identified with the Vspeed, specifically in both regions of the central North Sea. Many more relationships were identified with BT, as compared to SST, in all regions, except for the shallow central North Sea. Overall, the biological variables that were more often identified as indicators of abundance, biomass and/or breeding success included the sandeel larvae, Chla and NetPP. The range of the identified relationships can be seen in Fig. 3 and the ten most confident (confidence ≥ 0.35) relationships for each region are shown in Table 2.
Interactions between the demersal fish group and seabirds (guillemot and/or gannet) were often identified in most of the regions, however the relationship between demersal fish and guillemot was found in all regions, except for the shallow central North Sea (Table 2). Some other relationships that were found to be of high confidence in more than one region included Vspeed-NetPP, Chla-A2 and sandeel larvae-kittiwake.   The harbour porpoise was found to be a consistent component within the majority of the identified relationships from Table 2, except in the deep central North Sea.

Regional differences in the physical, biological, and human pressure interactions
The higher number of physical relationships (specifically with BT and PEA), as compared to the number of biological relationships, was specifically highlighted in Shetland/Orkney and west of Scotland regions ( Fig. 3a and b). However, at the same time, many relationships were found between the zooplankton assemblages and sandeel larvae with seabirds and marine mammals in the Shetland/Orkney and west of Scotland regions. In the deep central North Sea, and interestingly out of all the regions, the highest number of relationships was with Chla and sandeel larvae (Fig. 3c). Similarly, in the shallow central North Sea, the number of sandeel larvae and NetPP relationships was relatively high and also, the number of PEA relationships was the highest there, in comparison to all the regions (Fig. 3d).
The human pressure: fisheries catch, was found to be an indicator of biomass change for both fish groups in Shetland/Orkney and the shallow central North Sea, whilst only for the pelagic fish group in the west of Scotland. Relationships between the fisheries catch and sandeel larvae were found in all the regions.

Temporal trends of the physical and biological relationships within the four regions
We examined the temporal trends of the identified relationships from the four spatial regions. In particular, the focus was on years when the relationships were characterized by a "weakening" trend (confidence of the identified relationships dropped to < 0.5 for a minimum of five consecutive years), to identify the timing of potentially significant functional ecosystem changes (where the underlying interactions dramatically change) and which variables are the best indicators of such changes. The "weakening" trend is specifically defined to identify years of low confidence (strength) between a pair of variables for a single relationship. The low confidence suggests low similarity in the trend of the paired variables or a change in the temporal trend of one or two of the variables/indicators. We distinguish two potentially significant time periods of ecosystem change: after 2005 and after 2010. The potential indicators behind each of those periods and the relationships characterised with a weakening trend during those time periods are shown in Table 3. The indicators behind the first functional ecosystem change more often included the Table 2 The ten most confident (confidence ≥ 0.35) interactions per region. Bold and underlined indicates interactions that were found in more than one region.  The functional time periods for each spatial region and the key indicators associated with each time period, based on the "weakening" trends of the identified dependency relationships. biological variables (Chla, NetPP, sandeel larvae) with some physical and/or human pressure, depending on the region. Interestingly, in the deep central North Sea, there was no clear indication of a second functional time period, as with the rest of the regions. In this region, relationships between the higher trophic levels and the biological indicators (e.g. Chla-harbour porpoise, Fig. 4a) were characterized by a weakening trend from or after 2005, which continued for some of the relationships until the end of the time series, but for some of them the confidence then started to rise again, for example from 2014 (e.g. sandeel larvae-gannet, Fig. 4a). The indicators behind the second functional ecosystem change included the physical variables (BT, SST and PEA), with the addition of the Hspeed, Vspeed and NetPP for the shallow central North Sea and west of Scotland. Some example relationships that were characterized by weakening trends during this time period included A4-harbour porpoise, Hspeed-gannet and PEA-sandeel larvae (Fig. 4b), for all of which the confidence did not increase until the end of the time series.

Hidden variable
To assist in identifying the timing of any potential functional ecosystem changes with the indicators behind such changes, as well as any variables more reflective of such changes, we examine the most confident hidden variable relationships with the observed ecosystem components and the learned hidden variable (i.e. ecosystem state) from  the ARHMMs for each spatial region.
To support the spatial results from 3.1, here the hidden variable identified variables to be potentially important indicators of ecosystem change for the specific region, for example, SST in the shallow and Chla in the deep central North Sea regions (Table 4). Interestingly, in the latter region, the hidden variable identified the most confident relationship with the pelagic fisheries catch, whilst results from 3.1 showed relationships with fisheries catch only with the sandeel larvae. Similarly, the hidden variable for the Shetland/Orkney region identified the sandeel larvae to be key, whilst previous results showed only three relationships with the sandeel larvae (Fig. 3a). In addition, the hidden variable identified some of the higher trophic levels (e.g. guillemot, pelagic fish group) to be key species or groups with respect to ecosystem changes for the relevant region. Also, the trends for some of the relationships between the hidden variable and the observed components supported the timing of the same potentially important functional time periods (e.g. after 2005 and after 2010) for the ecosystem dynamics in the relevant spatial region.
During these time periods, the learned hidden variable state was stable, for example for both central North Sea regions, Fig. 5c, d. However, the hidden variable identified an earlier change in the ecosystem state for the deep central North Sea; it modelled one state until the late 1990s and then changed to another from the early 2000s (Fig. 5c). For the shallow North Sea, the hidden state stabilized after the late 1990s (Fig. 5d). The most different to the other regions were Shetland/Orkney and west of Scotland regions. The hidden variable stabilized until early 2000s and then, it modelled a rather fluctuating state for the west of Scotland region (Fig. 5b), whilst for the Shetland/ Orkney region, the hidden variable modelled a similarly changeable state that fluctuated throughout the whole time series (Fig. 5a).

Discussion
The outcomes of this modelling approach have shown that the last 30 years have contained large changes in how the four regional ecosystems around the North Sea have been functioning, with strong and perhaps some cyclic changes in the strength of relationships between possible drivers of change and the range of indicators. The methods and new findings provide a clear and repeatable way to identify the best indicators of change and deliver insight into the ecosystem processes that are driving shallow sea regions across the globe. These findings indicate that we need to pay much closer attention to how different habitat types of our global shallow seas are managed, as the same human (large-scale renewable developments, changes in fishing, etc.) and climatic pressures will be felt very differently in different regions. The implication of the detail of these findings are provided within the discussion by first setting out the reasoning used and the identification of the best indicators of ecosystem change. The later sections set out what the contrasts in regional and temporal differences provide in terms of better understanding of which relationships are maintained and which have tendencies to vary spatially and temporally. Lastly, given the regional spatial and temporal contrasts, we summarise the knowledge about these relationships to pinpoint the mechanisms that are driving shallow shelf sea ecosystems, such as to be able to improve predictions and implications to ecosystem responses from natural versus anthropogenic changes.

Best physical indicators
Overall, the physical variables that were consistently identified as indicators of abundance and/or biomass changes over time and across the regions included BT, PEA and Hspeed. Temperature is a major driver of marine ecosystems and one of the key factors affecting the physiology and ecology of all marine organisms (Simpson et al., 2011;Edwards et al., 2020;Evans and Waggitt, 2020). Interestingly, BT was found to be a better indicator compared to SST, except in the shallow central North Sea, which suggests that only in locations of shallower depth (<50 m), where BT and SST will be more similar all year round, due to the water column being more mixed, they may be interchangeable. PEA and Hspeed were found to be key indicators for capturing the dynamics of both lower trophic levels (e.g. sandeel larvae and Chla) and top predators (e.g. kittiwake and gannet). The fact that PEA has been shown to be a very important habitat variable in the few other studies that have used it (Carroll et al., 2015;Sadykova et al., 2020), highlights the importance of spatial and seasonal distribution of physical processes as good indicators up through the entire trophic chain. In addition, for the top predators, relationships with Hspeed may indicate that many species prefer regions with stronger currents (Waggitt et al., 2017;Waggitt et al., 2018) with the possible mechanism being due to the presence of shear between water layers having a role in prey capture (Scott et al., 2013;Lieber et al., 2018;Lieber et al., 2019).

Best biological indicators
The biological indicators that were shown to be consistent in capturing abundance, biomass and/or breeding success changes over time and across the regions included sandeel larvae, Chla and NetPP. Similar results were found in other spatial habitat studies, where it was shown that NetPP in particular played a significant role in determining fine-scale distributional preferences (Sadykova et al., 2020) and facilitating foraging (Cox et al., 2018) for both fish prey and top predator species. Interestingly, even at the regional scale of this work, some of the most confident relationships were found between the Vspeed and NetPP (Table 2), clearly picking up the fact that key impacts on mean seasonal change in NetPP are through vertical mixing (Zhao et al., 2019). This further supports the Longhurst Provinces approach (Vichi et al., 2011), that defining the predictability of primary production throughout the trophic chain should be taken into consideration in terms of future spatial management of anthropogenic aspects that can change mixing (Tweddle et al., 2018).
The sandeel larvae were also an important indicator, suggesting the most abundant pelagic prey species in the North Sea is an important factor governing the dynamics of top predators and fish species at the population level scale, as has been suggested for many years (Speirs et al., 2016). Therefore, our results reinforce the wide range of research outcomes that suggest that both spatial and temporal changes in the population dynamics of higher trophic levels are likely to reflect those of their preferred prey which may, in turn, be bottom-up driven by dynamic oceanographic processes (Cox et al., 2018).
Interestingly, Chla was another biological variable, which was consistently identified as an indicator, even more often than the NetPP. The maximum chlorophyll-a concentration (at any depth) can be influenced by a greater variety of weather and physical variables, enhancing the importance of specific processes of bottom-up control of the planktonic food web (Molinero et al., 2013), whereas NetPP values are highly influenced by depth, with shallower, more mixed areas Table 4 The identified interactions between the hidden variable and the observed ecosystem components for each region and a summary of their temporal changes. Bold and underlined indicates relationships that were found in more than one region. generally having higher depth integrated production. This suggests that locations of maximum chlorophyll-a are very ecologically important (Scott et al., 2010;Scales et al., 2014), as they can represent new and/or aggregated primary production, but they may not have high values of biomass and hence, do not show up as areas of high, depth integrated NetPP (Hickman et al., 2012).

Regional summary
Across the four, approximately equal-sized, spatial regions, there were many similarities in the identified relationships, but the differences were clearest in the number of relationships with the physical versus the biological indicators (Fig. 3). The Shetland/Orkney and west of Scotland regions seem to be driven by more physical trends, most likely due to oceanic influences of cold-water anomalies, seen in BTs (Fig. S2a-Fig. S5a) and salinity changes, captured by changes in PEA (Fig. S2c-Fig. S5c) in the Shetland/Orkney region and large-scale wind and salinity effects, similarly captured by PEA but also Hspeed ( Fig. S2d-Fig. S5d) and Vspeed (Fig. S2e-Fig. S5e) in the west of Scotland (González-Pola et al., 2018;Dye et al., 2020;Sharples et al., 2020). Shetland/Orkney was one of the two regions, where fisheries catch was identified as an indicator of biomass change for both fish groups, which could be explained by the fact that this is where the majority of fish biomass is removed as catch (MMO, 2020).
The number of relationships in the deep central North Sea region, even some relationships with physical variables such as BT and Hspeed, was much more dominated by biological indicators (e.g. sandeel larvae, Chla and NetPP), than the deeper, and more oceanic influenced, regions. However, the shallow central North Sea region was found to be the most contrasting, compared to the other three, with a very different "clustering" of physical and biological relationships (Fig. 3d). The hydrodynamic characteristics in the southern North Sea are strongly influenced by large sources of freshwater (i.e. Rhine river), which makes the region subject to large changes in PEA, mainly driven by density-driven stratification (van Leeuwen et al., 2016). The region's shallowness (<50 m) leads to high turbidity, that can significantly affect primary production and its rapid significant warming (>20 • C), and cold-water anomalies, seen in both BTs and SSTs (Fig. S2a, b-Fig. S5a, b), makes the region more temporally variable (Capuzzo et al., 2018). The shallow central North Sea region was the only other region, where fisheries catch was identified as an indicator of biomass change for both fish groups and the only region, where many relationships were characterised by weakening trends (Table 3), for some of which, the confidence equalled zero (Fig. 4b). Therefore, it appears that the ecosystem dynamics in this region might be more prone to variability and might not be as robust to natural and/or anthropogenic changes. The greater number of weakening relationships showing no rebound in strength may indicate that the shallow central North Sea is a region to be viewed as an "early indicator" of profound ecosystem changes, that may later be seen within the northern and deeper regions, given future natural and anthropogenic changes.
Across all regions, many relationships were consistently identified between the lower trophic level indicators and the harbour porpoise. This implies that harbour porpoise (the only marine mobile megafauna species in this study that does not exhibit centrally placed foraging), is a very good indicator species of potential physical and biological changes within the North Sea and much could be gained by closer inspection of the large north to south distribution shift of this species in early 2000s (Evans and Waggitt, 2020). Similarly, many relationships between the seabirds and demersal fish group were consistently found in all the N.I. Trifonova et al. regions (Table 2). This suggests the demersal fish group is a valuable indicator of ecosystem changes. For seabirds, it was the guillemot and gannet that were more often identified as better indicators across regions (Table 2 and Fig. 4), which has important implications with respect to which species should be a focus of study to better understand indirect effects of climate change via changes in food supply for assessments of seabird populations (Mitchell et al., 2020). For a regionspecific indicator, the harbour seal was found to be a good indicator of change for the region of west of Scotland (Tables 2 and 3), which has important breeding populations of the species (SCOS, 2019;Evans and Waggitt, 2020). The results of this study therefore suggest that, rather than picking a species as an indicator because of its protected status, it is best to use those top predator species that have strong and consistent relationships with other trophic levels as indicators. It is also important to take away the study's results that indicate that the relationships between top predator and lower trophic levels may be different in different regional habitat types.

Time periods of functional ecosystem change
A very important finding in this study was the amount of change in the strength of the identified data-driven estimates of interactions over time. For example, we can see from Fig. 4, that interactions are not necessarily characterised with only weakening trends, but rather their trends rise and fall throughout time, which provides us with insights on the ecosystem functioning and stability. Many different types of change are observed in the interactions; they can be rather non-linear, abrupt, and even sometimes reversible with the opposite patterns in the identified relationships (e.g. the rise to very high confidence levels, the extreme fall after 2005 to very low levels and the rise in confidence again over the last 5 years between the sandeel larvae-gannet in Fig. 4a). The concepts of using changes in direction of relationships to define regime shifts (Bakun, 1996;Bakun, 2004), ecosystem state (DeYoung et al., 2004) and feedback processes in marine ecosystems (Bakun and Weeks, 2006) are well known. This work provides the detail of multiple relationships of important physical and biological indicators and their changes over time to begin to pinpoint the mechanisms more effectively behind ecosystem changes.

After 2005
The first time period of functional ecosystem change was identified as after 2005, with most of the main indicators of change found to be biological variables: Chla, NetPP and sandeel larvae across all regions, with the addition of fisheries catch (in Shetland/Orkney and west of Scotland regions, Table 3). The few changes in dependencies with physical variables that were seen were BT (in Shetland/Orkney and shallow central regions) and PEA (in shallow central region), (Table 3). The combination of indicators suggests that the processes behind the first functional change can be linked to stratification and the effects on primary production, impacting the planktonic food-web via changes in hydroclimatic mechanisms. Specifically, large-scale climate variability (as identified by using the NAO) during the period 1997-2004 (Pitois et al., 2012), with cold-water anomalies in 2005, seen in the BTs (Fig. S2a-Fig. S5a; González-Pola et al., 2018) and increases in lower salinity in 2005, seen in PEA (Fig. S2c-Fig. S5c), as a result of a reduction in the exchange between the North Sea and the North Atlantic and the spreading of lower salinity water westward from the Norwegian coast (Holt et al., 2018;Sharples et al., 2020). The NAO can be associated with localized variability in particular processes, and has also been shown to be correlated to changes in sea level, wind and waves (Tsimplis et al., 2005;Wolf and Woolf, 2006), as well as precipitation, thus having a different impact in different areas (Moberg and Jones, 2005). Evidence that such hydroclimatic variations have led to effects on food availability farther up the trophic chain has been seen in effects on the populations of clupeids, sandeels and gadoids (Pitois et al., 2012;Molinero et al., 2013).

After 2010
The second time period of functional ecosystem change was recognised after 2010, with most of the indicators, behind this change found to be physical variables: BT, PEA, SST (in Shetland/Orkney and shallow central regions), Hspeed in the shallow central North Sea and west of Scotland regions, whilst Vspeed only in the shallow central region (Table 3). The only biological variable, identified as an indicator behind this change was NetPP, but only in the shallow central North Sea and west of Scotland. These variables imply the potentially higher importance of physical indicators and their changes, specifically temperature (over salinity), behind the second functional ecosystem change. Changes included cold-water temperature anomalies in 2010 (González-Pola et al., 2018) and later in 2013, which were more clearly seen in the BTs, rather than the SSTs (Fig. S2a, b-Fig. S5a, b). In addition, particular years (2015-2018) with very low strength relationships (e.g. NetPP-DEM in Fig. 4a), that were also of relatively low NetPP values for both central North Sea regions ( Fig. S4g and Fig. S5g) and west of Scotland (Fig. S3g), are most likely attributed to changes in mixing, captured by the Hspeed and Vspeed indicators (Fig. S2d, e-Fig. S5d, e) and changes in their interactions ( Table 3). The combination of these results suggests that the mechanism behind the second functional time period was the interplay between the physical indicators (temperature and mixing) and productivity. This important level of ecosystem change has not been that widely recognised yet, as it has only been partially mentioned by a few studies across different disciplines, that have discussed the relationship between temperature and productivity and their associated changes for zooplankton and fish (Trifonova et al., 2017;Capuzzo et al., 2018), with the attributes behind NetPP decline linked to changes in stratification that impact supply of nutrients through mixing (Couespel et al., 2021).

Hidden variable
A hidden variable was used in this study to learn and therefore, represent the ecosystem state, and specifically capture any changes in the ecosystem interactions that lead to changes in state, thus illuminating the possible mechanisms behind such changes. Including the hidden variable during the learning of the relationships with the observed ecosystem components proved useful in terms of adding additional information to the identification and role of relevant indicators with respect to ecosystem dynamics.
The success of using the hidden variable to identify indicator species of key importance to the ecosystem dynamics has previously been shown in the North Sea (Trifonova et al., 2015) but also for other systems, e.g. Baltic Sea (Uusitalo et al., 2018). In this study, the hidden variable reaffirmed the importance of the physical (BT, SST and PEA) and biological (e.g. Chla and sandeel larvae) indicators by consistently identifying relationships with those indicators across most regions, but also identified additional indicators for some regions (e.g. fisheries catch in the deep central North Sea) ( Table 4). The hidden variable also identified higher trophic levels to be key to the ecosystem dynamics but were more region-specific (Table 4). For example, further recognising the importance of the guillemot and gannet in some regions (Fig. 6), adds additional credence to the conclusion that these two bird species are key indicators of the local ecosystem structure and function and should be used as indicators for examining changes felt throughout the trophic chain. The pelagic fish group was identified for the Shetland/Orkney region, further highlighting the potential impact from fisheries catch on the ecosystem dynamics in this region. These results highlight the importance of knowing which habitat types are best represented by which functional groups and species. These results imply that only certain species can be used in certain habitat types as accurate indicators of temporal changes to understand the relevant ecosystem dynamics.
In addition, the hidden variable added additional knowledge to identifying the timing of any functional ecosystem changes by modelling a change in the learned ecosystem state (Fig. 5). The hidden variable picked up a trend that agrees with the timing of the North Sea "regime shift" from the late 1980s to the mid-1990s (Edwards et al., 2002;Beaugrand, 2004b) and the previously identified state of warm temperatures with low salinity from 1997 to 2004 (Pitois et al., 2012). This was more clearly shown for both regions within the central North Sea (Fig. 5c, d). The ecosystem state for the two deeper and oceanic influenced regions was rather more changeable, although the state for the west of Scotland changed from stable to more erratic from 2003, coinciding with the timing of the above-mentioned temperature and salinity changes. Indeed, it is by examining the learned ecosystem state that allows us to conclude whether the environment is in a desirable (predictable) or less desirable state and during which years the state is desirable or not. Thus, the hidden variable, once set up and updated with rather small effort, could potentially be used to check for possible new changes in the underlying ecosystem dynamics, indicative of major changes in the ecosystem, which could then be further investigated (Uusitalo et al., 2018).

Summary
To summarise, based on both the spatial and temporal results, the outcomes of the analysis of contrasting regions provided evidence as to which physical and biological variables are consistent indicators of spatial and temporal changes (e.g. BT, PEA, SST, Chla, sandeel larvae). The region-specific results (Fig. 6) provided further insights into which species and/or groups of species are the best indicators for the main different habitat types (deep, freshwater or oceanic influenced) of shallow seas. Most importantly, our work has shown that the strength of such indicators may also vary over time, thus the need to consider both spatial and temporal variability of the indicators throughout the trophic chain, is critical to insuring that the strongest and most consistent, highly predictable indicators, of ecosystem change are used. This work lays out the foundations to begin to identify the possible mechanisms behind region-specific and temporal ecosystem changes. The contrasts in the strength of interactions within the regions and over time may reveal much needed insights into the details of marine ecosystem structure and function under the stress of both climate change and anthropogenic forces. Two major periods of change were identified by mostly biological indicators in the early 2000s, and by mainly physical indicators post 2010s, along with evidence of fishing pressure as an indicator in all regions (Fig. 6). The hidden variable modelled a relatively stable ecosystem state for some of the regions, however, these regions were found to be more likely affected by major structural changes (central North Sea regions). Other regions (e.g. Shetland/Orkney and west of Scotland) are subject to more physical pressures (e.g. cold-water anomalies and salinity changes) and other factors (fisheries catch in the Shetland/Orkney region), thus a more changeable ecosystem state.

Implications
This work paves the route for rapid evaluation of ecosystem structure and function and provides methodology to determine the past and current ecosystem state. All of this information provides an effective baseline that can be used within marine spatial planning considerations of the relevant implications of future climate change versus anthropogenic impacts. Such information will be useful to guide what habitats/ species are more representative of what disturbances and what management decisions are required to steer towards more ecologically sustainable conditions under the influence of future changes.
For example, stratification (PEA), has been shown to be a very important indicator in this study and is a well understood mechanism for bottom-up control of the base of marine ecosystem dynamics (Sharples et al., 2020). PEA can also be measured accurately with relatively inexpensive instrumentation and is easily modelled now and in the future under climate change by physical oceanographers (De Dominicis et al., 2018). In the shallow sea regions, with depths < 50 m, which are the type of region targeted for static wind farms, the possible extraction of > 100GW of energy (European Commission Communication, 2020), will change the levels of PEA through both the effects of the structures within the water column increasing mixing locally (Schultze et al., 2020), as well as the possible effects of reducing wind mixing, due to removing energy from the wind (Ludewig, 2015;van Berkel et al., 2020). In the Shetland/Orkney region, which has some of the most tidal power available for extraction, the effects of extraction of reasonable amounts of tidal energy (6 GW) has effects on PEA many miles downstream (De Dominicis et al., 2017). However, most importantly, the effects of tidal energy extraction have been shown to be very minor in comparison to the effects of climate change by 2050 in a "business as Fig. 6. Illustration of the best indicators (shown inside each region) and the major time periods of ecosystem change, which they identified (shown by the vertical bars on the arrows, representing the time series in the study: 1990-2019). The harbour porpoise was found to be a consistent indicator throughout space and time, thus shown outside of the highlighted regions.
usual" scenario (De Dominicis et al., 2018). It is with these types of comparisons, and with the added baseline knowledge from this study on how changes in PEA will also affect higher trophic levels in different regions under different climate conditions, that much more informed choices for marine spatial planning decisions should be made.
Indeed, the importance of physical (e.g. temperature), biological (e. g. primary production) and human pressure indicators (e.g. fisheries catch) has also been recognised in other shelf seas around the globe (e.g. Celtic Sea, Hernvann et al., 2020;Irish Sea, Bentley et al., 2019; shelf seas surrounding New Zealand/Aotearoa, Stevens et al., 2021). This illuminates the feasibility of generalising knowledge from our study and translating it across other shallow shelf seas to be able to address the importance of indicators to ultimately strengthen predictions of population changes at wider ecosystem scales. This increase in knowledge about what indicators are best to use and what they are telling us about how the ecosystem is functioning will ultimately lead to more strategic and integrated approaches to both monitoring studies and assessing anthropogenic impacts to enable evaluation of trade-offs and benefits to provide the most sustainable future spatial use of our seas at whole ecosystem scales. Using this knowledge about best indicators, future research will involve the development of "what-if?" scenarios to investigate the specific effects on the ecosystem components, following changes in the best indicators. This will be done in combination with physical models on finer spatial scales, to provide more detailed understanding of anthropogenic impacts on specific habitats and species.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.