Modeling streamflow variability at the regional scale: (1) perceptual model development through signature analysis

Perceptual catchment models are a key first step towards hydrologic synthesis and provide the process foundation of predictive models. However, their development is usually limited to the headwater catchment scale where field work can be done. Here we develop a perceptual model approach for a regional scale catchment. We use a systematic “top-down” approach based on an expert-driven interpretation of the streamflow responses of nested subcatchments. This approach complements the traditional “bottom-up” perceptual model development based on fieldwork observations, which is impractical at the regional scale due to measurement limitations. As an example to illustrate our approach and demonstrate its feasibility, we use the 27,100 km² Moselle catchment, with 26 gauged subcatchments. Our perceptual model of the Moselle catchment presents a description of dominant processes that we consider a-posteriori reasonable, but not a-priori obvious: precipitation, much more than evaporation or groundwater exchange, controls the spatial variability of average streamflow; lithology influences the partitioning between baseflow and quickflow; and topography and land use control hydrograph lag times. A striking feature of our analysis is the rather minor role that vegetation and soil appear to play in streamflow spatial variability at the scale of our inquiry. We use our perceptual model to build a distributed yet parsimonious hydrological model in Part 2 of this paper series. Although the perceptual model is specific to the Moselle, we hope that the perceptual model approach used here can be helpful to others working in large catchments in different hydroclimates, where other factors will dominate.


Introduction
Perceptual models in catchment hydrology are qualitative representations of hydrological systems intended to illustrate their main components, behavior and interaction (e.g. Beven, 2012; Gupta et al., 2012). Critically, perceptual models are a key first step towards hydrologic synthesis and provide the process foundation of predictive models (Beven, 2012). While perceptual model development is increasingly recognized for its central importance (e.g. Wagener et al., 2020) and several illustrations of perceptual models have been made based on expert knowledge (e.g. Beven and Chappell, 2021), few methods have been discussed to bring data to the perceptual model development exercise for large watersheds. Perceptual models have been constructed mainly at the hillslope or headwater catchment scale (<10 km²) (e.g. McGlynn et al., 2002; Wrede et al., 2015), where field measurements can inform process perception and form the basis for the "dialog between experimentalist and modeler" (Seibert and McDonnell, 2002). But how to develop perceptual models beyond the headwater catchment scale, where field-based sampling and measurement are impractical?
The extreme scale dependency of runoff processes means that perceptual models developed at the field scale on hillslopes and headwater catchments are generally not representative at the meso- (100-1,000 km²) or regional scale (>10,000 km²) (Sivapalan, 2003). That is, just as headwater catchments are not a linear superposition of soil cores (McDonnell et al., 2021), the meso- or regional scale catchment is not a linear superposition of headwater catchments. Thus, characterizing the dominant processes at the regional scale does not imply an account of the heterogeneity and complexity observed at the hillslope scale (Sivapalan, 2003).
Regional scale perceptual models could be useful beyond the development of tailored conceptual model structures. In particular, they could help identify regularities and "laws" at the regional scale (e.g. Dooge, 1986; Wagener et al., 2007), including regions of hydrological similarity that exert primary controls on hydrological dynamics (e.g. Loritz et al., 2018; Ehret et al., 2020), or scaling relationships associated with process aggregation (e.g. Leopold and Maddock, 1953; Blöschl and Sivapalan, 1995). Such insights could contribute regional knowledge that can be exploited in catchment classification studies (e.g. Gnann et al., 2021), or inform "regularization" relationships and constraints for improving model realism and parsimony (e.g. Pokhrel et al., 2008; Gharari et al., 2014). Despite their potential scope, however, the development of regional scale perceptual models has seldom been transparent and lacks clear guidance. Dedicated studies that illustrate such development are few (e.g. Winter, 2001; Savenije, 2010; Gnann et al., 2021). Expert knowledge certainly plays a role. It is a function of personal and subjective experience (Beven and Chappell, 2021), but it is also linked to intuition, theoretical background and the transfer of knowledge from other catchments.
Here we add and explore the interpretation of in-situ data and how it can combine with other forms of expert knowledge to inform perceptual model development at the regional scale. While the traditional perceptual model development based on local fieldwork observations can be regarded as "bottom-up", our approach based on the interpretation of catchment responses is more "top-down". In particular, we base our process understanding on the interpretation of the spatial variability of streamflow signatures within a nested catchment. We acknowledge that the interpretation of the spatial variability of hydrological signatures has been the objective of many studies focused on catchment classification (e.g. Wagener et al., 2007; Sivakumar et al., 2015). However, these studies have typically adopted a "large sample" approach, which resulted in prioritizing the impact of climate at the expense of landscape (e.g. Beck et al., 2015; Addor et al., 2018), and have relied on general purpose statistical or machine learning approaches (e.g. Peñas et al., 2014; Sivakumar et al., 2015), which were generally difficult to interpret. Our reliance on a nested catchment setup and on expert knowledge is intended to increase process understanding at the regional scale, so as to clarify the interplay between climate and landscape in controlling streamflow responses.
Anticipating which signatures best characterize streamflow variability in any nested catchment, and which dominant processes affect them, is not a priori obvious. First, "uniqueness of place" in terms of catchment properties or measurements limits our ability to create generalizable hypotheses (Beven, 2000). For example, Oudin et al. (2010) showed that "seemingly" similar catchments can respond drastically differently, which required digging beyond commonly used metrics of catchment similarity. Second, classification and regionalization studies have shown that the signatures that best characterize streamflow variability may differ from one region to the other depending on the underlying dominant processes (e.g. Blöschl et al., 2013; Gupta et al., 2013). Third, even the same signatures can respond to different process controls in different environments. For example, Gnann et al. (2021) showed that the spatial variability of baseflow signatures could be associated with soil and sediment texture in regions covered by glacial deposits in the northern United States, compared to soil stratigraphy in the Appalachian Mountains in North Carolina, sinkhole density in the Ozarks, and the maturity of the landscape represented by geological age in the Oregon Cascades. This variability of process controls then motivated different perceptual models in these distinct areas. Thus, we suspect that these process controls at the regional scale are equivocal for many other signatures.
Here we demonstrate a systematic approach for perceptual model development at the regional scale, using the 27,100 km² Moselle catchment in Europe as a case study. We use 26 stream gauging stations with 26 years of streamflow records, climatic variables, and landscape property maps. Unlike other systems of its size in Europe, the Moselle has relatively low population density (150 inhabitants per square kilometer, www.fgg-rhein.de), an absence of major lakes and exceptionally good data availability. Leveraging these attributes, and to show a proof of concept with a "simple" regional scale catchment, our objectives are:
1. To develop a hydrological synthesis for the Moselle catchment in the form of a distributed perceptual model of regional scale streamflow generating processes.
2. To outline the key steps of this process as a proof-of-concept for how to consider regional scale perceptual model development elsewhere using widely available data.
3. To characterize expert knowledge and illustrate how it can be contributed both by experimentalists and modelers in the various stages of perceptual model development.
The paper is organized into six sections. Section 2 presents the study area and the data. Section 3 presents the methodology, including the selection of streamflow signatures, the climatic and landscape indices, and the approaches to identify the influence factors on the selected streamflow signatures. Section 4 analyzes the factors influencing the selected streamflow signatures, and outlines the Moselle perceptual model. Section 5 contains a discussion about the broader implications of the proposed approach, including a synthesis of how expert knowledge was used in the perceptual model development process. Section 6 concludes the paper, providing a synthetic summary of how the objectives of the study have been addressed. As an application example of the developed perceptual model, Fenicia et al. (2022), hereafter referred to as FMM2022, illustrates the development and selection of a distributed, yet parsimonious conceptual model.

The Moselle catchment
The study area is shown in Fig. 1. We focus on the catchment with outlet in Cochem, 50 km upstream of the confluence with the Rhine (drainage distance), with a contributing area of 27,100 km 2 (Fig. 1a). Elevations range between 60 and 1424 m a.s.l., with 96% of the catchment area lying below 600 m (Fig. 1b). Land cover is dominated by forest (38%), followed by crop (30%), pasture (20%), developed areas (8%), defined here as urban, sub-urban, industrial and other artificial landuses, and wetland (4%) (Fig. 1c). In terms of geology, the Moselle catchment extends over four main structural units: two geological basins (the Paris basin in the center-west and the Sarre-Nahe Bergland Permian basin in the east), consisting of mainly sedimentary rocks, and two geological massifs (the Vosges massif in the south, culminating at 1424 m a.s.l., and the Rhenish massif in the north, reaching an altitude of 816 m a.s.l.), consisting mainly of metamorphic rocks. The lithology of the catchment is varied, given the extent of the area (see later Fig. 7a). Soil texture shows contrasting characteristics across the catchment, with coarse material in the mountainous southern part of the catchment, fine materials in the central part, and medium to medium-fine materials in the north (Fig. 1d).
We consider 26 subcatchments based on a selection of stream gauging stations (Fig. 1a). Table 1 reports both the "total" and "incremental" area of the subcatchments, that is, their entire contributing area, and the incremental area from the next upstream subcatchment (hence corresponding to the colored regions in Fig. 1a). The hydrometeorology of the subcatchments is shown in Fig. 2. Fig. 2a shows their hydrographs, whose similarities and differences are analyzed in detail in Section 3.2. Fig. 2b represents the subcatchments in Budyko space (Budyko, 1974), which highlights the variability in the hydrometeorological conditions in the area, with dryness index I D (the ratio between average potential evaporation and average precipitation) ranging between 0.5 and 1.2, and evaporative index I E (the ratio between average actual evaporation, calculated as the difference between average precipitation and streamflow, and average precipitation) varying between 0.3 and 0.7. The subcatchments show significant correlation in the Budyko space, with a Spearman correlation r s = 0.79, and a corresponding p-value p s well below the conventional significance threshold of 5%.
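As an illustration of the Budyko-space quantities described above, the following minimal sketch computes the dryness index, the evaporative index (with actual evaporation approximated as precipitation minus streamflow, as in the text) and a Spearman rank correlation. The function names and any input values are hypothetical and are not taken from the study's data or code.

```python
def budyko_indices(p_mean, q_mean, epot_mean):
    """Return (dryness index I_D, evaporative index I_E) from long term means.

    All inputs must share the same units (e.g. mm/yr). Actual evaporation
    is approximated as P - Q, i.e. groundwater exchange is neglected.
    """
    i_d = epot_mean / p_mean
    i_e = (p_mean - q_mean) / p_mean
    return i_d, i_e


def _ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    rx, ry = _ranks(x), _ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5
```

In use, one would call `budyko_indices` once per subcatchment and then pass the two resulting lists to `spearman`. The significance test (p-value) is not included in this sketch.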

Data sources
Table 1. List of nested subcatchments within the Moselle catchment, sorted by their total area, from smallest to largest. The incremental area indicates the area from the next upstream subcatchment (the incremental area is the same as the total area for subcatchments without upstream subcatchments).
F. Fenicia and J.J. McDonnell

The data used in this study are classified into three categories: streamflow, climate (precipitation and temperature), and landscape (maps of elevation, land cover, lithology and soil texture):
• Streamflow. Streamflow data Q (mm/h) (normalized by the catchment area) are considered at 26 stations (Fig. 1a), for the period 01.09.1989 - 31.08.2015, where we have assumed the 1st of September to be the start of the hydrological year. The German data are obtained from the German Federal Institute of Hydrology (BfG), the Luxembourgish data are obtained from the "Administration de la gestion de l'eau", and the French data are obtained from Eaufrance (downloaded at hydro.eaufrance.fr). All streamflow time series are available at hourly resolution.
• Climate. Precipitation P (mm/d) and temperature T C (°C) data are obtained from the E-OBS gridded dataset (version 14.0, downloaded at www.ecad.eu) (Haylock et al., 2008). This dataset has daily temporal resolution and a regular 0.25 × 0.25 degrees grid spatial resolution (about 20 × 30 km at the catchment's latitude). It is based on an interpolation of ground station data. Note that for precipitation, the daily average of a given day is computed between 6.00 UTC of that day and 6.00 UTC of the following day (Haylock et al., 2008). The same interval is therefore used in computing daily streamflow.
Since the precipitation stations used by the E-OBS data set have much lower density in the French part of the catchment than in the German part, as also apparent in Fig. 1 of Haylock et al. (2008), the precipitation in the subcatchments with the majority of their areas in France (Niedaltdorf, Hauconcourt, Malzeville, Metz, Rosseldange, Toul, Custines, Epinal, Luneville, Nomeny, Boncourt, Wittring) is recalculated using ground station data (obtained from Météo France), interpolated with the Thiessen polygon method. Temperature data are used to derive potential evaporation (E Pot) time series, using the 1985 Hargreaves equation (Hargreaves and Allen, 2003).
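For readers unfamiliar with the 1985 Hargreaves equation, the sketch below shows one common way to evaluate it. It assumes that daily mean, minimum and maximum temperatures are available (the text above only states that temperature data are used), and it computes extraterrestrial radiation with the standard FAO-56 expressions. The function names are hypothetical, and this is not necessarily the exact implementation used in the study.

```python
import math


def extraterrestrial_radiation(lat_deg, day_of_year):
    """Daily extraterrestrial radiation Ra (MJ m-2 day-1), FAO-56 form.

    Valid for non-polar latitudes, where the sun rises and sets every day.
    """
    phi = math.radians(lat_deg)
    # Inverse relative Earth-Sun distance and solar declination
    dr = 1.0 + 0.033 * math.cos(2.0 * math.pi * day_of_year / 365.0)
    delta = 0.409 * math.sin(2.0 * math.pi * day_of_year / 365.0 - 1.39)
    # Sunset hour angle
    ws = math.acos(-math.tan(phi) * math.tan(delta))
    gsc = 0.0820  # solar constant, MJ m-2 min-1
    return (24.0 * 60.0 / math.pi) * gsc * dr * (
        ws * math.sin(phi) * math.sin(delta)
        + math.cos(phi) * math.cos(delta) * math.sin(ws)
    )


def hargreaves_pet(t_mean, t_min, t_max, ra_mj):
    """Potential evaporation (mm/day) from the 1985 Hargreaves equation.

    Temperatures in degrees Celsius; ra_mj in MJ m-2 day-1. The factor
    0.408 converts radiation to equivalent evaporation depth in mm/day.
    """
    ra_mm = 0.408 * ra_mj
    t_range = max(t_max - t_min, 0.0)  # guard against inconsistent inputs
    pet = 0.0023 * ra_mm * (t_mean + 17.8) * math.sqrt(t_range)
    return max(pet, 0.0)
```

A daily E Pot series would be obtained by calling `extraterrestrial_radiation` with the subcatchment latitude and the day of year, and passing the result to `hargreaves_pet` together with that day's temperatures.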
For the calculation of the climatic indices described in Section 3.3, as well as those represented in Fig. 2b, the climatic data concomitant with the availability of streamflow data in each subcatchment are used.
• Landscape. Landscape data include maps of elevation, soil, lithology and land cover:
  • Elevation. The digital elevation model is obtained from USGS HydroSHEDS (resolution 78.4 m). Catchment elevation is shown in Fig. 1b.
  • Land cover. The map is generated using the Corine Land Cover European database (clc12, version 18.5, downloaded at land.copernicus.eu). Five land cover classes are considered: crop, forest, pasture, developed areas (urban, industrial, etc.) and wetland (Fig. 1c).
  • Lithology. The maps are obtained from different sources depending on the country: for France we use the "BD LISA" database (version 1, niveau 2, ordre 1, scale: 1:250,000, downloaded at https://bdlisa.eaufrance.fr), for Germany we use the "Geologische Übersichtskarte der Bundesrepublik Deutschland (GÜK200)" (downloaded at www.bgr.bund.de, scale: 1:200,000), for Luxembourg the lithology map is obtained from the "Administration de la gestion de l'eau" (scale of 1:250,000), and for Belgium we use the ihme1500 database (version 11, downloaded at www.bgr.bund.de, scale: 1:1,500,000). The IHME data have a much coarser resolution than the other maps; however, only 3% of the catchment belongs to Belgium (see above). The lithology maps are combined, resulting in a total of 31 lithology classes (shown later in Fig. 7a).
  • Soil. Soil properties are obtained from the European Soil Database (version 2.0). The distribution of soil texture (attribute TEXT) into fine, medium-fine, medium and coarse classes is shown in Fig. 1d.
These data are used for the calculation of the metrics defined in the following Section.

Methodology
Our approach to develop a perceptual model characterizing streamflow regional variability consists of the following basic steps:
1. Determine a set of streamflow signatures that characterize key traits of streamflow spatial variability
2. Specify the methods that are most appropriate to interpret the spatial variability of each individual streamflow signature
3. Identify the landscape or climatic attributes that cause the variability of each individual streamflow signature
4. Synthesize the interpretation of each individual streamflow signature in the form of a distributed perceptual model, which elucidates the dominant processes affecting the spatial variability of streamflow
The intended characteristics of the perceptual model are schematically represented in Fig. 3. In particular, the perceptual model is expected to include the following information: (1) the definition of a key set of streamflow signatures that characterize streamflow spatial variability at internal subcatchments; (2) the identification of the main landscape or climate controls on the spatial variability of each streamflow signature, which should lead to an understanding of the climate or landscape variability that matters at that scale and which is used to define regions of hydrological similarity; (3) the characterization of the dominant processes in each region of hydrological similarity, which is derived by interpreting the processes that underlie the variability of individual streamflow signatures; (4) the connectivity between processes associated with distinct regions, hence how distinct landscape sections connect to each other to generate the overall catchment response; and (5) the relationships between spatially variable processes, such as how a certain process (e.g. baseflow, runoff or channel routing) varies in magnitude depending on landscape properties. The various elements of the perceptual model may be associated with varying degrees of confidence, which the development process would help in characterizing.

Fig. 3. Key objectives of the perceptual model characterizing the regional variability of subcatchment responses in a regional scale catchment. Relevant multi-site streamflow signatures are related to the spatial variability of climate or landscape, and then interpreted based on dominant processes, hydrological connectivity, and relationships between processes.
In this section, we describe the definition of streamflow signatures (Section 3.1), their potential process controls based on previous literature (Section 3.2), and the definition of landscape and climatic indices used in subsequent analyses (Section 3.3).

Selection of streamflow signatures
Our definition of streamflow signatures starts with a visual inspection of the 26 hydrographs designed to single out hydrograph similarities and differences. Fig. 4 exemplifies the key results of this analysis:
1. Streamflow magnitude can vary significantly between subcatchments. For example, Fig. 4a shows the hydrographs at Epinal and Luneville, two neighboring subcatchments in the south of the Moselle catchment. The streamflow at Epinal appears to be systematically larger than the streamflow at Luneville.
2. There are noticeable differences in baseflow versus quickflow proportions between hydrographs. For example, Fig. 4b shows the difference between the hydrographs at Niedaltdorf (on the river Nied), and Reinheim (on the river Biel). The two subcatchments are located in the center east of the Moselle catchment and they are closely spaced (the outlets are 50 km apart). It is noticeable that Niedaltdorf has larger baseflow (and lower peakflow) than Reinheim.
3. The time it takes for the hydrograph to rise in response to a rainfall event was found to vary significantly between subcatchments. For example, Niedaltdorf and Nomeny appear to have similar baseflow characteristics, but Niedaltdorf appears to have a flashier response than Nomeny (Fig. 4c). These two subcatchments are also closely spaced (they have a border in common), and the subcatchment with slower response (Nomeny) is the smaller of the two (929 vs 1332 km²).
4. Hydrographs at consecutive gauges on the river network, if the areas of the corresponding subcatchments are not significantly different, can be very similar. For example, Fig. 4d shows the case of Cochem (the catchment outlet) and Trier (50 km upstream in linear distance). Although the two hydrographs are similar, they clearly show an offset, with the downstream hydrograph lagging behind.
5. The hydrographs also present similarities. In particular, they all appear to share a similar seasonal behavior, with higher flows during winter, and lower flows during summer, as apparent for example in Fig. 2.
These visual perceptions have motivated the selection of the following streamflow signatures, which are more specifically defined in Appendix A:
1. Streamflow average: Q (mm/yr), which is simply the long term average of streamflow observations
2. Baseflow index: Q BFI (-), which indicates the proportion of baseflow over total flow
3. Time-to-peak: Q TTP (h), designed to reflect the average duration of the rising limbs of the hydrograph
4. Relative lag: Q Lag (h), which quantifies the time shift between two hydrographs
5. Half streamflow date: Q HSD (d), defined as the number of days, counted from the beginning of the hydrological year (i.e. 1st of September, see Section 2.2), after which half of the yearly streamflow has passed, and intended to capture streamflow seasonality
For further analyses, it is important to keep streamflow signatures that are relatively independent and show sufficient variability. Fig. 5 shows that these conditions apply for the first three signatures, confirming the visual impression that they reflect distinctive traits of streamflow variability. In particular, the Spearman correlation values r s are relatively low, and the p-values p s are relatively large, indicating limited correlation significance. The relative lag signature is only applicable to pairs of subcatchments, and therefore is not directly comparable with the other signatures. The half streamflow date showed very little variability (it varied between 23 January and 9 February for all subcatchments), which confirmed the visual impression that hydrographs have similar seasonal behaviour, and motivated its exclusion from subsequent analyses.
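To make the half streamflow date concrete, the sketch below computes it for a single hydrological year from a daily streamflow series. The exact definition used in the study is given in Appendix A and may differ in detail (for example in how multiple years are averaged); the function names here are hypothetical.

```python
from datetime import date, timedelta


def half_streamflow_day(daily_q):
    """Day number (1 = first day of the hydrological year) on which the
    cumulative streamflow first reaches half of the yearly total.

    daily_q: daily streamflow values for one hydrological year, ordered
    from the first day of that year (1 September in this study).
    """
    total = sum(daily_q)
    if total <= 0:
        raise ValueError("yearly streamflow must be positive")
    running = 0.0
    for day, q in enumerate(daily_q, start=1):
        running += q
        if running >= 0.5 * total:
            return day


def day_to_date(day, start_year):
    """Convert a day number to a calendar date, assuming the hydrological
    year starts on 1 September of start_year."""
    return date(start_year, 9, 1) + timedelta(days=day - 1)
```

For instance, a half streamflow day of 150 in the hydrological year starting in 1989 corresponds to 28 January 1990, which falls within the range reported above.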
It is worth pointing out that the first three signatures listed above broadly correspond to the hydrograph characteristics that the three components of the Kling-Gupta efficiency (KGE) aim to optimize (Gupta et al., 2009). In particular, the KGE aims at matching the overall volume of flow, spread of flow, and timing of the hydrograph, which are characteristics that can be associated with the first three signatures listed above.
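For reference, the three KGE components can be made explicit with a short sketch. It uses the commonly cited form of the KGE, with the linear correlation r (timing), the ratio of standard deviations (spread) and the ratio of means (volume). The function name is hypothetical and the code is an illustration, not the authors' implementation.

```python
import math


def kge(sim, obs):
    """Kling-Gupta efficiency of simulated versus observed streamflow.

    Returns (kge, r, alpha, beta), where r is the Pearson correlation,
    alpha = std(sim) / std(obs) and beta = mean(sim) / mean(obs).
    """
    n = len(obs)
    m_sim, m_obs = sum(sim) / n, sum(obs) / n
    # Population standard deviations (the 1/n factor cancels in alpha and r)
    s_sim = math.sqrt(sum((s - m_sim) ** 2 for s in sim) / n)
    s_obs = math.sqrt(sum((o - m_obs) ** 2 for o in obs) / n)
    cov = sum((s - m_sim) * (o - m_obs) for s, o in zip(sim, obs)) / n
    r = cov / (s_sim * s_obs)
    alpha = s_sim / s_obs
    beta = m_sim / m_obs
    value = 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)
    return value, r, alpha, beta
```

A perfect simulation gives r = alpha = beta = 1 and hence KGE = 1. A simulation that doubles every observed value keeps r = 1 but yields alpha = beta = 2.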

Fig. 5.
Scatter plot of selected streamflow signatures. The correlation is low, meaning that the signatures are relatively independent. Note also that the signatures present strong variability (e.g. streamflow varies between about 200 and 1000 mm/yr).

Approaches to identify process controls on selected streamflow signatures
After having identified a set of key streamflow signatures, we ask the following two sets of questions: (1) what are their potential process controls? And more specifically, are these controls already a priori obvious? And (2) which methods are most appropriate to identify them? Can available hydrological experience help define tailored and interpretable methods for analyzing individual streamflow signatures, as opposed to general purpose statistical or machine learning approaches?
In order to answer these questions, we synthesize previous studies that have analyzed our selection of streamflow signatures.

Streamflow average
Explaining the spatial variability of streamflow average is part of the more general problem of exploring the geographical variability of the long term water budget (e.g. Budyko, 1974; Dooge, 1992; Milly, 1994). The long term water budget can be expressed as:
$$\bar{Q} = \bar{P} - \bar{E}_{Act} - \bar{G} \qquad (1)$$
where the overbar is used to indicate the long term averages and Q, P, E Act and G represent streamflow, precipitation, actual (total) evaporation, and groundwater outflow respectively.
From Eq. (1), it is clear that the spatial variability in Q is due to the combination of P, E Act and G. However, in which proportion these three terms contribute to this variability, and if one of them eventually dominates on the other, is not a priori clear. Moreover, while P can be measured, the other two terms need to be somehow estimated.
The long term evaporation E Act has been related to various climatic and landscape factors. Perhaps the simplest way of representing these dependencies is by relating E Act to the average potential evaporation E Pot by a simple "stress factor" (e.g. Peng et al., 2019). The more popular Budyko model (Budyko, 1974) relates E Act to E Pot and P. There are several approximately equivalent formulations of the Budyko model (e.g. Daly et al., 2019), including the one parameter Turc-Mezentsev formula (Turc, 1954; Mezentsev, 1955), widely used in many applications (e.g. Bouaziz et al., 2018; Andréassian and Sari, 2019):
$$\bar{E}_{Act} = \frac{\bar{P}}{\left[1 + \left(\bar{P}/\bar{E}_{Pot}\right)^{b}\right]^{1/b}} \qquad (2)$$
where b is a model parameter. Such a simple formulation has been shown to leave various degrees of unexplained variability depending on the application, which has been attributed to various unaccounted climate or landscape characteristics in different studies, including the characteristics of vegetation, seasonality, soil properties and topographic controls (e.g. as reviewed in Greve et al., 2016).
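As an illustration, the sketch below evaluates a one parameter Turc-Mezentsev curve in its commonly written form, E_Act = P / [1 + (P / E_Pot)^b]^(1/b), and the streamflow it implies when groundwater exchange is neglected. The function names and the parameter values in the usage note are hypothetical.

```python
def turc_mezentsev(p_mean, epot_mean, b):
    """Long term actual evaporation from the Turc-Mezentsev formula.

    p_mean and epot_mean are long term averages in the same units
    (e.g. mm/yr); b is the dimensionless shape parameter.
    """
    if p_mean <= 0 or epot_mean <= 0 or b <= 0:
        raise ValueError("p_mean, epot_mean and b must be positive")
    return p_mean / (1.0 + (p_mean / epot_mean) ** b) ** (1.0 / b)


def streamflow_without_groundwater(p_mean, epot_mean, b):
    """Long term streamflow implied by the water budget when the
    groundwater term G is assumed to be zero: Q = P - E_Act."""
    return p_mean - turc_mezentsev(p_mean, epot_mean, b)
```

With equal precipitation and potential evaporation of 1000 mm/yr and b = 2, the formula gives about 707 mm/yr of evaporation. Larger values of b push the estimate towards the water and energy limits, so that evaporation approaches min(P, E_Pot).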
Groundwater flow, which is the other main unknown in Eq. (1), can be an important component of the long term water balance. For example, Schaller and Fan (2009) made an extensive investigation on 1555 catchments in the United States, and found that groundwater imports and exports can be significant, primarily driven by climate (with arid basins more affected), basins size (with larger basins being more self-contained), and bedrock permeability (with high bedrock permeability leading to larger groundwater import or exports). In the Meuse catchment, neighbouring the Moselle catchment, Bouaziz et al. (2018) found that several subcatchments are affected by significant groundwater flow, which was attributed to the effect of lithology. The possible significance of groundwater flow has motivated its inclusion even in very simple conceptual models (Le Moine et al., 2007).
Based on the description above, our approach to identify the process controls on the streamflow average signature in the Moselle catchment is as follows:
• We will use Eq. (1) as the framework to explain the variability of the regional water budget in the Moselle catchment.
• Our investigation will start with regressing Q on P, which is measured. We will then include E Act, estimated using formulations ranging from simple (e.g. a simple stress factor) to more complex (e.g. Eq. (2)), and proceed by accounting for G, which could be identified by leveraging the nested catchment setup, which implies that the water "lost" at some subcatchments needs to be "gained" at some neighbouring subcatchments (Bouaziz et al., 2018). In line with the top-down approach, there will be no need to investigate further once a reasonable fit to the observed Q is obtained.
In summary, Eq. (1) provides the framework to explore the spatial variability of Q, but to what extent it is possible to explain this variability, and in which proportion P, E Act and G contribute to it, is not a priori obvious. We will thus examine available data to establish a plausible explanation for the spatial variability of streamflow average.
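The first step of this top-down strategy, regressing average streamflow on average precipitation across subcatchments, can be sketched as follows. The sketch takes the long term budget as Q = P - E_Act - G, uses an ordinary least squares fit as one possible choice of regression, and treats whatever remains after P and E_Act as an estimate of G. The function names are hypothetical and no study data are reproduced.

```python
def linear_fit(x, y):
    """Ordinary least squares fit y = slope * x + intercept.

    Returns (slope, intercept, r2), with r2 the coefficient of determination.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    r2 = 1.0 - ss_res / ss_tot
    return slope, intercept, r2


def budget_residual(q_mean, p_mean, eact_mean):
    """Residual of the long term water budget, interpreted as an estimate
    of the groundwater outflow term: G = P - E_Act - Q."""
    return p_mean - eact_mean - q_mean
```

In use, `linear_fit` would take the subcatchment averages of P and Q as `x` and `y`. If the fit is already reasonable, the top-down logic suggests stopping there; otherwise E_Act and then the residual term can be examined.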

Baseflow index
Several studies have indicated bedrock as the most important control on the variability of the baseflow index (BFI) (e.g. Lacey and Grayson, 1998;Longobardi and Villani, 2008;Bloomfield et al., 2009). This finding has an intuitive explanation, as "groundwater is stored in the rocks, especially if they are highly fractured, and this contributes to baseflow" (Lacey and Grayson, 1998). However, different bedrock features can describe the relation of bedrock to baseflow characteristics in different places. For example, in the Alzette catchment in Luxembourg the variability of baseflow index was correlated to lithology types (Pfister et al., 2017), whereas in the Oregon Cascades it was explained by the maturity of the landscape rather than the lithology type (Tague and Grant, 2004).
Besides bedrock properties, a plethora of additional landscape or climate controls was found responsible for the regional variability of baseflow index in different studies (e.g. Price, 2011). Using 8600 catchments in the United States, Santhi et al. (2008) found that "relief" (maximum minus minimum elevation in the catchment) was the most strongly correlated variable to BFI, with lower relief leading to higher baseflow; using 3394 catchments worldwide, Beck et al. (2013) found that the BFI was best related to the mean and seasonality of potential evaporation, mean annual air temperature, and mean snow water equivalent depth. They also observed that their results seemed to contradict the perception of a relationship between slope or lithology and BFI. Using 103 catchments in Europe, Schneider et al. (2007) found that soil type could be used to predict baseflow index, although the goodness of fit decreased in the southern areas; using 67 catchments in Belgium, Zomlot et al. (2015) found that baseflow index was mostly affected by vegetation cover; using 12 catchments in Tanzania, Mwakalila et al. (2002) found that baseflow index was positively correlated with the ratio of precipitation to potential evaporation.
Based on the review above, the following approach is adopted to explain the variability of BFI in the Moselle:
• We will start by regressing BFI on easily accessible metrics of catchment similarity, including topography, soil and land use.
• If the results are not satisfactory, we will sort the variability of lithology classes into a restricted number of relative bedrock permeability classes, based on expert judgment, and regress these properties on baseflow index. This attempt is motivated by the finding that lithology is an important control on streamflow variability in the Alzette mesoscale catchment in Luxembourg, which is part of the Moselle (Hellebrand et al., 2007; Pfister et al., 2017). It is however unclear whether these findings will hold true at the much larger regional scale.
In summary, the variability of BFI depends in principle on multiple controls, to the point that it may be even hazardous to rely on the intuition that they can be related to bedrock properties. Our approach will therefore be relatively inclusive in terms of potential controls, and potentially require a careful assessment of bedrock hydrological properties.

Time-to-peak
The time span of the rising limb of the hydrograph has been indicated by various terms, including time-to-peak, time of concentration, lag-time, and catchment response time, with various (sometimes overlapping) definitions, some considering the time difference between rainfall and runoff events, and others considering hydrograph properties alone (e.g. Gericke and Smithers, 2014; Beven, 2020). In spite of their different definitions, such timing signatures are strongly correlated to each other, as they share the same intention of characterizing the duration of a hydrograph pulse response (e.g. see Fig. 2 in Gericke and Smithers, 2014). In this section, we use "time-to-peak" to collectively refer to such metrics.
Numerous previous studies have related time-to-peak to various observable catchment properties. The most widely used models adopt multiplicative relationships between topography-related properties, suggesting that the time-to-peak increases with basin size and decreases with basin slope (e.g. Taylor and Schwarz, 1952;Gericke and Smithers, 2014;Beven, 2020). The characteristics of the hypsometric curve, an integrated measure of catchment topography, have also been successfully related to time-to-peak, as an alternative to other topography-related measures (Harlin, 1984;Luo and Harlin, 2003). Rodríguez-Iturbe and Valdés (1979) developed a theoretical relationship between time-to-peak and the Horton number, in an attempt to overcome the conceptual limitations of the "so many and not very illuminating regressions we keep using in the field". Such a theoretical relation, however, has not shown high predictive value, as noted for example by Howard (1990), who found that Horton's numbers "collectively account for only 5 per cent of the explained variance in the estimating equations". His results instead confirmed that terms related to catchment size and average channel gradients provided the greatest predictive power.
Some studies showed that underground flow pathways, and the porous media related to them (soil and lithology), can have a significant impact on time-to-peak. Dunne (1978) showed that hydrographs generated by infiltration excess overland flow had times-to-peak 40 times shorter than hydrographs generated by subsurface stormflow. Gaál et al. (2012) used 396 Austrian catchments and showed that karst catchments as well as catchments dominated by schist (phyllite) had much more delayed responses compared to other catchments with marl and clay lithology. In a study on 3 experimental catchments with different lithology, Wrede et al. (2015) showed that the catchment on schist had a time-to-peak on the order of days, whereas the other two catchments on clay and sandstone responded nearly coincidently with the rainfall.
Based on the review above, we will adopt the following approach to investigate time-to-peak variability in the Moselle catchment:
• We will look for a multiplicative relationship between landscape attributes.
• The relevant explanatory variables in the regression models will be determined through stepwise regression (e.g. Berger and Entekhabi, 2001), a common method that involves progressively adding variables based on how well they improve the fit. This approach has the advantage that the order in which variables are added provides information on their influence. However, it may not be able to capture complex relationships between data beyond the one envisaged.
In summary, similarly to the previous streamflow signatures, it would be difficult to a-priori guess which landscape or climatic properties would influence the spatial variability of the time-to-peak in the Moselle catchment. As this signature was described primarily using multiplicative relationships in previous work, we will adopt this framework to look for suitable fitting relationships.
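The stepwise search for a multiplicative relationship can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used in this study: the multiplicative model is fitted by ordinary least squares after a log transform, variables are added greedily by the reduction in the log-space sum of squared errors, and the predictor names and all data values are hypothetical.

```python
import math

def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_loglinear(y, columns):
    """Fit log(y) = b0 + sum_k b_k log(x_k); return (coefficients, SSE)."""
    log_y = [math.log(v) for v in y]
    rows = [[1.0] + [math.log(col[i]) for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * log_y[n] for n, r in enumerate(rows)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    sse = sum((log_y[n] - sum(b * v for b, v in zip(beta, r))) ** 2
              for n, r in enumerate(rows))
    return beta, sse

def forward_stepwise(y, predictors, max_vars):
    """Greedily add the predictor that most reduces the log-space SSE."""
    selected, history = [], []
    while len(selected) < max_vars:
        best = None
        for name in predictors:
            if name in selected:
                continue
            cols = [predictors[n] for n in selected + [name]]
            _, sse = fit_loglinear(y, cols)
            if best is None or sse < best[1]:
                best = (name, sse)
        selected.append(best[0])
        history.append(best)
    return history

# Hypothetical subcatchment attributes and a synthetic time-to-peak
predictors = {
    "distance": [10, 20, 35, 50, 80, 120],
    "slope": [0.12, 0.05, 0.09, 0.03, 0.07, 0.02],
    "forest": [0.3, 0.6, 0.4, 0.5, 0.2, 0.7],
}
ttp = [2.0 * d ** 0.8 * s ** -0.5
       for d, s in zip(predictors["distance"], predictors["slope"])]
history = forward_stepwise(ttp, predictors, max_vars=2)
```

The order of the names in `history` reflects the order in which variables entered the model, which is the information on relative influence mentioned above.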

Relative lag
The delay exhibited by flow as it travels downstream along a river reach is a relatively well understood phenomenon in river hydraulics. The lag between hydrographs at successive points along a channel can be modeled by the so-called Saint-Venant equations, which are based on the conservation of mass and momentum in channels, and are well described in many hydrology and hydraulics textbooks (e.g. Henderson, 1966). This route, however, is both data demanding, as it requires channel geometry and roughness information, and potentially computationally intensive, depending on the simplifications one is prepared to accept. It is therefore impractical for applications such as the use of the perceptual model to inform a conceptual model, particularly at an initial stage where the main building blocks of the conceptual model are not yet refined, and their definition in successive model development stages may require computationally intensive calibrations and comparisons of multiple model variants.
These arguments raise the question of whether there are simpler, even if not as accurate, approaches for modeling such delay. The answer may come from geomorphology, and in particular, from the scaling laws that connect hydraulic and geomorphological properties of natural rivers (e.g. Leopold and Maddock, 1953;Rodríguez-Iturbe and Rinaldo, 1997). Leopold and Maddock (1953) empirically observed that in a downstream direction, average streamflow velocity $v$ increases following a power law relationship:

$$v \propto q^{\alpha} \qquad (3)$$

where $q$ is the average streamflow (not normalized by the catchment area). Empirical investigations returned a value of $\alpha = 0.1$. This relatively low value indicates that the velocity tends to remain constant or increase slightly in the downstream direction. Considering that, at a first approximation, $q$ increases linearly with the catchment area $A$ (e.g. Rodríguez-Iturbe and Rinaldo, 1997) ($q \propto A$), and that following a kinematic wave approximation, celerity $c$, hence the flood wave propagation speed, is proportional to velocity ($c \propto v$) (e.g. Henderson, 1966), based on Eq. (3), one can assume $c \propto A^{\alpha}$ and therefore:

$$Q_{Lag} \propto \frac{L_{AB}}{A_{AB}^{\alpha}} \qquad (4)$$

where $Q_{Lag}$ is the lag between the hydrographs at two locations A and B, $L_{AB}$ is the flow distance between the two locations, and $A_{AB}$ is the average of the catchment areas at the two locations. This relationship accounts for the fact that the celerity increases (and therefore the lag reduces) for larger catchments (which have on average larger flow).
Our proposed approach is therefore as follows:
• We will identify successive gauges for which the intermediate contributing area is proportionally small and calculate the hydrograph lag.
• We will regress the lags calculated for different pairs of subcatchments using Eq. (4).
In summary, in contrast to the other three streamflow signatures, the relative lag between hydrographs can be estimated using well established river hydraulics methods. A-priori unclear, however, is to what degree simplified approaches, such as relying on Eq. (4), have predictive value.
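The chain of proportionalities behind Eq. (4) can be written out explicitly. The sketch below assumes that the lag is the travel time of the flood wave over the reach, i.e. flow distance divided by celerity; the lag symbol $Q_{Lag}$ follows the notation used later for this signature.

```latex
\begin{align*}
v \propto q^{\alpha}, \quad q \propto A, \quad c \propto v
  \quad &\Rightarrow \quad c \propto A^{\alpha}, \\
Q_{Lag} = \frac{L_{AB}}{c}
  \quad &\Rightarrow \quad Q_{Lag} \propto \frac{L_{AB}}{A_{AB}^{\alpha}}.
\end{align*}
% Worked example with \alpha = 0.1: at fixed L_{AB}, a tenfold increase in A_{AB}
% scales the lag by 10^{-0.1} \approx 0.79, i.e. a reduction of about 21%.
```

The small exponent means that the area correction is modest: the flow distance remains the dominant term, and the area only reduces the lag by roughly a fifth per order of magnitude.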

Selection of climatic and landscape indices
Our selection of climatic and landscape indices is listed in Table 2. In terms of climatic indices we restricted our selection to average precipitation and potential evaporation, where the average was extended to the streamflow data availability period for each subcatchment (see Fig. 3 in FMM2022). We did not consider snow related indices (e.g. Sawicz et al., 2011;Razavi and Coulibaly, 2013), as only a small fraction of precipitation falls as snow (below 5% in each subcatchment), nor indices reflecting the phase offset between precipitation and evaporation (e.g. Berghuijs et al., 2014), as catchments are in the same climatic region, and, as already observed (Section 3.1), hydrographs have similar seasonal behaviour.
The landscape indices are extracted using the four maps described in Section 2.2 (topography, land use, lithology, and soil texture). As we use several landscape indices, to minimize confusion, we represent them with the notation $L^{(Y)}_{X}$, where $L$ stands for landscape index, $Y$ indicates the underlying map (Top, Lnd, Lit and Stx respectively), and $X$ indicates the specific attribute (see Table 2 for the complete list).
Our selection of landscape indices is justified as follows. In terms of topography, some of the indices (see Table 2) are the fractions of topographical "wetland", "hillslope" and "plateau" introduced by Gharari et al. (2011), where "wetland" is the landscape with "height above nearest drainage" lower than a specified threshold (here 10 m), and the rest of the landscape is partitioned into "hillslope" and "plateau" if the slope is respectively higher or lower than a specified threshold (here 11%, according to Gharari et al. (2011)).
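The threshold-based partitioning into "wetland", "hillslope" and "plateau" can be illustrated with a minimal sketch. The function names and the cell values are hypothetical, and the treatment of cells lying exactly on a threshold is an assumption, as it is not specified above.

```python
def classify_landscape(hand_m, slope_pct, hand_threshold_m=10.0,
                       slope_threshold_pct=11.0):
    """Classify one grid cell from its height above nearest drainage and slope."""
    if hand_m < hand_threshold_m:
        return "wetland"
    # Cells exactly at the slope threshold are assigned to "plateau" (assumption)
    if slope_pct > slope_threshold_pct:
        return "hillslope"
    return "plateau"

def landscape_fractions(cells):
    """Return the fraction of cells in each class for a list of (hand, slope)."""
    counts = {"wetland": 0, "hillslope": 0, "plateau": 0}
    for hand_m, slope_pct in cells:
        counts[classify_landscape(hand_m, slope_pct)] += 1
    return {name: count / len(cells) for name, count in counts.items()}

# Hypothetical cells: (height above nearest drainage in m, slope in %)
cells = [(2.0, 3.0), (5.0, 20.0), (30.0, 15.0), (40.0, 4.0)]
fractions = landscape_fractions(cells)
```

Note that the height criterion is applied first, so a steep cell close to the drainage network is still counted as "wetland".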
Fractions of land cover types as the ones defined here (Table 2) are used in many studies (e.g. Zhang et al., 2004;Yadav et al., 2007;Oudin et al., 2008;Razavi and Coulibaly, 2013;Kuentz et al., 2017). Land cover was also expressed by alternative metrics, such as the leaf area index (e.g. Sawicz et al., 2011;Addor et al., 2017), or metrics oriented to reflect the vegetation biomass (e.g. Jencso and McGlynn, 2011), which here we did not consider.
Lithology is often represented directly through the fraction occupied by the lithological formations (e.g. Bloomfield et al., 2009;Jencso and McGlynn, 2011;Fenicia et al., 2016;Kuentz et al., 2017). As these formations can be many, it is useful to group them into more general classes (e.g. Yadav et al., 2007;Hellebrand et al., 2008;Dal Molin et al., 2020). The formulation of suitable lithology indices is explored in Section 4.1.2.
We used soil texture for characterizing soil data (Table 2), as is common in many studies (e.g. Addor et al., 2017;Kuentz et al., 2017). Alternative soil characteristics have proved useful for hydrological purposes, such as the "Hydrology Of Soil Types" classification system developed for the United Kingdom (Boorman et al., 1995), and used in several studies (e.g. Schneider et al., 2007;Yadav et al., 2007;Ali et al., 2012), but such data are generally not available in other regions.
Although not necessarily exhaustive, compared to previous studies on streamflow signature variability, our selection of landscape indices is rather inclusive. For example, it has not been uncommon to base the interpretation of streamflow spatial variability on topography information alone (see Ali et al. (2012) for a review). Moreover, such indices are based on widely available information, and therefore have broad applicability.

Results and interpretations
The following sections present the identification of the process controls on the four identified streamflow signatures (Section 4.1), and their synthesis into a regional scale perceptual model of the Moselle catchment (Section 4.2). Appendix B includes a correlation analysis between all landscape indices, climatic indices and streamflow signatures, which provides a broad overview of various dependencies and complements the targeted analyses presented in Section 4.1. In order to capture potentially nonlinear dependencies we used the Spearman correlation $r_s$, for which a significance p-value threshold $p_s < 5\%$ is typically assumed.

Explaining variability in "streamflow average"
As described in Section 3.2.1, Eq. (1) provides the framework to explore the variability of the long term water budget. Our approach to identify the main influence factors on the variability in $Q$ proceeded by accounting for $P$, $E_{Act}$ and $G$, until a sufficient fit to the observed $Q$ was achieved. The models proposed here were fitted by least squares regression and assessed in terms of the Nash and Sutcliffe efficiency, $F_{NS}$.
We started with a very simple model where $Q$ is related to $P$ and the other terms are combined into a single constant $b$:

$$Q = P - b \qquad (5)$$

This model already provided an exceptionally good performance, with $F_{NS} = 0.91$. This relatively high performance indicated that the margin of improvement was small. As described in Section 3.2.1, an obvious attempt to evolve Eq. (5) was to better characterize the evaporation term, by including its potential controls. As discussed in Section 3.2.1, potential evaporation is a first order control on actual evaporation, and a simple way to account for its effect is as follows:

$$Q = P - b\,E_{Pot} \qquad (6)$$

where $b$ in this case represents the proportionality between potential ($E_{Pot}$) and actual evaporation. This model resulted in $F_{NS} = 0.87$, and therefore did not improve the fit compared to Eq. (5). Finally, we considered the Budyko model of Eq. (2), which, as discussed in Section 3.2.1, is an established approach to represent the long term catchment water balance, with $b$ treated as a calibration parameter (Eq. 7). This model returned $F_{NS} = 0.90$, which is intermediate between the two previous models.

Fig. 6a shows the excellent correlation between $Q$ and $P$ (Spearman correlation $r_s = 0.92$, and corresponding p-value $p_s$ close to zero, indicating high significance), which suggested $P$ as the dominant control on $Q$, and automatically downplayed other potential controls. Fig. 6b shows the poor correlation between actual (difference between observed precipitation and streamflow long term averages) and potential evaporation. The relatively good fit obtained with the models above suggested a small influence of $G$. Hence, concerning the long term water balance, it appeared reasonable to assume that the subcatchments are watertight.
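The comparison between the first two water balance models can be illustrated with a short sketch. It assumes that Eq. (5) has the form Q = P − b and Eq. (6) the form Q = P − b·E_Pot, so that in both cases the least squares estimate of b has a closed form; the function names and all data values are hypothetical.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 minus SSE over the variance of observations."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst

def fit_constant_loss(p, q):
    """Least squares b for Q = P - b, i.e. the mean of P - Q (assumed form)."""
    return sum(pi - qi for pi, qi in zip(p, q)) / len(p)

def fit_pet_scaling(p, q, e_pot):
    """Least squares b for Q = P - b * E_pot (assumed form)."""
    numerator = sum(e * (pi - qi) for pi, qi, e in zip(p, q, e_pot))
    return numerator / sum(e * e for e in e_pot)

# Hypothetical long term averages in mm/yr at five subcatchments
p = [800.0, 900.0, 1000.0, 1200.0, 1500.0]
q = [300.0, 410.0, 490.0, 710.0, 990.0]
e_pot = [600.0, 620.0, 640.0, 610.0, 630.0]

b_const = fit_constant_loss(p, q)
nse_const = nash_sutcliffe(q, [pi - b_const for pi in p])

b_pet = fit_pet_scaling(p, q, e_pot)
nse_pet = nash_sutcliffe(q, [pi - b_pet * e for pi, e in zip(p, e_pot)])
```

Because each model has a single parameter, a lower efficiency for the second model simply means that the spatial pattern of potential evaporation does not help to explain the spatial pattern of P − Q.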

Influence factors on the "baseflow index" signature
The correlation between $Q_{BFI}$ and climatic indices, or landscape indices based on topography, land use and soil, was generally low (Spearman correlation $r_s \le 0.55$). We therefore embarked on the task of sorting the variability of lithology into relative permeability classes (Section 3.2.2). In particular, we defined three permeability classes of low, medium and high permeability, represented with the symbols $L^{(Lit)}_{L}$, $L^{(Lit)}_{M}$ and $L^{(Lit)}_{H}$ (Table 2). Using expert judgment, as outlined below, we reclassified each of the lithology classes into one of these three permeability classes. Table 3 lists the lithological classes sorted by the area they occupy in the Moselle catchment. The first 15 formations already occupy 95% of the catchment. With reference to these formations, the following rationale was used for their classification: formations such as marl, schist, dolomite rock, and crystalline basement were considered as low permeability, as formed by fine particles or hard rock; formations such as sandstone, alluvium, sand, gravel and conglomerate were considered as high permeability, as notably capable of storing and transmitting water; limestone was considered as medium permeability, as intermediate between sandstone and marl; mixed formations were considered as medium permeability if the permeable formation was mentioned first (e.g. sandstone and marl), and low permeability if the impermeable formation was mentioned first (e.g. marl and sandstone). The original lithology map and the final bedrock permeability map are shown in Fig. 7.

Fig. 8a shows that there was a strong and highly significant correlation ($r_s = 0.84$, $p_s$ close to zero) between baseflow index $Q_{BFI}$ and the fraction of bedrock with high permeability $L^{(Lit)}_{H}$. Such a relationship was close to linear, as the Pearson correlation was also high ($r_p = 0.87$), and could therefore be expressed as follows:

$$Q_{BFI} \propto L^{(Lit)}_{H} \qquad (8)$$

where $\propto$ is the proportionality sign.
This relationship lent itself to an obvious interpretation, as high bedrock permeability favors groundwater flow, which produces baseflow. An obvious attempt to improve Eq. (8) was therefore to include also medium permeability bedrock in the equation, as this may also contribute to baseflow, although to a lesser extent. Hence, we considered the following equation:

$$Q_{BFI} \propto L^{(Lit)}_{H} + b\,L^{(Lit)}_{M} \qquad (9)$$

where $L^{(Lit)}_{H}$ and $L^{(Lit)}_{M}$ are the fractions of bedrock with high and medium permeability. Optimizing the coefficients to minimize least squares resulted in some improvement ($r_s = 0.89$, $r_p = 0.90$), with an optimized parameter $b = 0.30$ (Fig. 8b).
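One possible way to estimate the weight of the medium permeability class is sketched below. It is an illustration under assumptions rather than the fitting procedure of this study: the baseflow index is taken to vary linearly with the weighted sum of high and medium permeability fractions, b is searched on a grid restricted to [0, 1], and the value maximizing the Pearson correlation is retained (which, for a linear model with free slope and intercept, corresponds to the least squares optimum). All data are synthetic.

```python
import math

def pearson(x, y):
    """Pearson linear correlation coefficient between two sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fit_medium_weight(bfi, frac_high, frac_medium, steps=100):
    """Grid search for the weight b of the medium permeability fraction."""
    best_b, best_r = 0.0, -2.0
    for i in range(steps + 1):
        b = i / steps
        combined = [h + b * m for h, m in zip(frac_high, frac_medium)]
        r = pearson(bfi, combined)
        if r > best_r:
            best_b, best_r = b, r
    return best_b, best_r

# Synthetic fractions of high and medium permeability bedrock
frac_high = [0.1, 0.5, 0.2, 0.8, 0.4, 0.0]
frac_medium = [0.6, 0.1, 0.3, 0.1, 0.2, 0.5]
# Synthetic baseflow index generated with a known weight of 0.3
bfi = [0.2 + 0.5 * (h + 0.3 * m) for h, m in zip(frac_high, frac_medium)]

best_b, best_r = fit_medium_weight(bfi, frac_high, frac_medium)
```

With real data the correlation would not reach one, and the sensitivity of the correlation to b gives an indication of how well the weight is constrained.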

Influence factors on the "time-to-peak" signature
The pairwise correlation analysis in Table A1 showed that $Q_{TTP}$ did not correlate strongly with any of the selected indices. We therefore explored multiplicative relationships between landscape attributes using stepwise regression (as anticipated in Section 3.2.3).

Fig. 6. Correlation between precipitation and streamflow averages (panel a), and potential and actual evaporation averages (panel b) at all subcatchments. Precipitation appears to be a strong influence factor on streamflow, whereas potential evaporation shows no influence on actual evaporation.

Table 3
Reclassification of lithology classes (sorted by decreasing area that they occupy in the Moselle catchment) into permeability classes. The area fraction indicates the "weight" of each reclassification decision. The cumulative area fraction column shows that classes 1-15 already make up 95% of the catchment area. Low, medium and high permeability classes make up 50%, 25% and 24% of the catchment area respectively.

The $Q_{TTP}$ measured at Metz was an apparent outlier, since it had a value of 86.5 h and was 4.7 standard deviations away from the mean of the other subcatchments. Therefore, we removed it from the subsequent analysis. Table 4 reports the best three empirical relations for the time-to-peak signature $Q_{TTP}$ obtained with one to four variables. Table 4 (first column) shows that when going from one to four variables, the Spearman correlation $r_s$ increased from 0.58 to 0.85; the p-values were well below the significance threshold already for the single variable correlations, and approached zero when multiple variables were considered. Besides the fit to the data, an important consideration in order to identify a plausible relationship was its interpretability. A relationship that appeared to fulfill such prerequisites was represented by the best performing equation ($r_s = 0.78$) using three landscape indices (Eq. 10), suggesting that the time-to-peak increased with increasing distance, and with decreasing slope and fraction of developed land use. The dependency on the first two variables was reflected in many time-to-peak models, as reviewed in Section 3.2.3, and the inverse dependency on developed land was intuitive, as urbanization causes faster drainage. It is also reassuring that the variables in Eq. (10) appeared recurrently also in the lower dimensional relationships shown in Table 4.
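The outlier check applied to the time-to-peak at Metz can be sketched as follows. The function name and the data are hypothetical, and the use of the sample standard deviation of the remaining values is an assumption.

```python
import statistics

def deviation_from_others(values, index):
    """Distance of values[index] from the mean of the other values,
    expressed in units of their sample standard deviation (assumption)."""
    others = values[:index] + values[index + 1:]
    return (values[index] - statistics.mean(others)) / statistics.stdev(others)

# Hypothetical time-to-peak values in hours; the last one is the candidate outlier
time_to_peak_h = [10.0, 12.0, 14.0, 16.0, 18.0, 30.0]
z = deviation_from_others(time_to_peak_h, index=5)
```

Excluding the candidate from the mean and standard deviation avoids the masking effect whereby a large outlier inflates the spread against which it is judged.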

Influence factors on the "hydrographs relative lag" signature
Fig. 8. (...) Relative bedrock permeability appears to be a strong influence factor on baseflow index.

Table 4
Combination of variables that provide the best fit to the time-to-peak signature ($Q_{TTP}$). Only multiplicative types of relationships are used. The best fit using three variables suggests that $Q_{TTP}$ increases with geometric distance and decreases with the fraction of developed land use and with the 5% quantile of the slope.

Based on the analysis described in Section 3.2.4, we started by assuming that the lag between hydrographs at two points along the river network A and B, $Q_{Lag}$, increases proportionally with the flow distance between A and B, $L_{AB}$:

$$Q_{Lag} \propto L_{AB} \qquad (11)$$

Based on the assumption that celerity should increase slightly in the downstream direction (Section 3.2.4), we then considered:

$$Q_{Lag} \propto \frac{L_{AB}}{A_{AB}^{\alpha}} \qquad (12)$$

where $A_{AB}$ is the average of the subcatchment areas at A and B, and the exponent $\alpha$ was fixed at 0.1, as suggested by previous studies (Section 3.2.4). This relationship accounted for the fact that the celerity increases (and therefore the lag reduces) for larger catchments (which have on average larger flow). These two relationships are compared in Fig. 9. The markers are color coded according to the ratio between area of the downstream and upstream subcatchments, with darker color indicating higher ratio. Darker dots should represent pairs where lateral inflow in between stations is smaller, and therefore be more representative.
The Spearman correlation coefficients $r_s$ for Eqs. (11) and (12) were 0.49 and 0.54, whereas the corresponding p-values were 18% and 14% (the Pearson linear correlations $r_p$ were higher at 0.74 and 0.79 respectively), showing that the variability of $Q_{Lag}$ was to a certain extent captured by the two models, and that Eq. (12) was an improvement compared to Eq. (11), although considerable uncertainty remained. These results indicated that Eq. (12) provides a first order approximation of $Q_{Lag}$, which avoids the need for much more complex flow routing models when a coarse level characterization of catchment behaviour is needed, such as in the development of parsimonious and computationally efficient conceptual models.
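The comparison between the two lag predictors can be sketched as follows, assuming that the first predictor is the flow distance alone and the second is the flow distance divided by the average area raised to the power α. The Spearman correlation is computed as the Pearson correlation of average ranks; the function names and the data for the gauge pairs are hypothetical, and p-values are not computed.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson linear correlation coefficient between two sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman rank correlation as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def lag_predictor(length_km, area_km2, alpha=0.0):
    """Flow distance divided by area**alpha; alpha = 0 gives distance alone."""
    return length_km / area_km2 ** alpha

# Hypothetical gauge pairs: flow distance, average area, observed lag
lengths_km = [12.0, 30.0, 45.0, 60.0, 25.0]
areas_km2 = [500.0, 2000.0, 9000.0, 15000.0, 1200.0]
lags_h = [3.0, 7.0, 8.0, 11.0, 6.0]

rs_distance = spearman(lengths_km, lags_h)
rs_scaled = spearman(
    [lag_predictor(l, a, alpha=0.1) for l, a in zip(lengths_km, areas_km2)],
    lags_h,
)
```

Since the Spearman correlation depends only on ranks, the area correction changes the result only where it reorders the gauge pairs, which is one reason to report the Pearson correlation alongside it.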

A perceptual model of the Moselle catchment
Fig. 10 presents our perceptual model of the Moselle catchment, which synthesizes our understanding of the spatial variability of streamflow in the Moselle subcatchments and its underlying process controls, according to the perceptual model objectives represented in Fig. 3. The main traits of streamflow regional variability were distilled into four streamflow signatures, namely the streamflow average $Q$, the baseflow index $Q_{BFI}$, the time-to-peak $Q_{TTP}$, and the relative lag between hydrographs at successive stations $Q_{Lag}$. These streamflow signatures appeared to respond to distinct process controls:
• Differences in the amount of streamflow observed at different subcatchments were related to the spatial variability of precipitation (Section 4.1.1). Hence, our perceptual model considered spatially distributed precipitation, at the subcatchment resolution.
• Variability in baseflow vs. quickflow partitioning was mainly attributed to the influence of lithology (Section 4.1.2). Our process interpretation was that lithology affects bedrock permeability. In particular, with increasing permeability, deep percolation and groundwater flow are progressively more dominant. As a result, our perceptual model partitioned the landscape into distinct bedrock permeability classes (low, medium and high) associated with runoff generating processes that determine distinct hydrograph reactions (fast, both fast and slow, and slow respectively).
• The time-to-peak could be related to topography and land use (Section 4.1.3). In particular, time-to-peak was found to increase with basin size, and to decrease with basin slope and with a larger area of developed (e.g. urban) land. The relation with basin size and slope was consistent with most time-to-peak models (Section 3.2.3). We motivated the relationship to developed land by considering that urbanization often results in improved drainage, which causes a faster catchment response. In our perceptual model, the process controls on the time-to-peak were schematically represented through subcatchment specific lags (marked in blue in Fig. 10).
• The hydrograph lag due to channel routing could be approximated using a simple relationship, as increasing with the length of a river stretch and decreasing with the average flow (Section 4.1.4). This relationship, which is based on a geomorphologic scaling law (Section 3.2.4), can be adopted as a first approximation instead of more complex (although potentially more accurate) routing models. In our perceptual model, these routing effects were schematically represented using channel reach specific lags (marked in red in Fig. 10).
• Regional groundwater flow did not appear to be a dominant process (Section 4.1.1). Hence, our perceptual model purposely omitted this process and approximated the subcatchments as watertight.
• Vegetation, potential evaporation, and soil did not appear to affect the variability of any of the selected streamflow signatures (Section 4.1.1 and Appendix B). Our perceptual model, therefore, did not distinguish between vegetation and soil types.
Fig. 9. Correlation between the drainage distance between successive stream gauging stations (A and B) and streamflow lag (panel a), and between the ratio of drainage distance to average subcatchment area (with power α = 0.1, as suggested by Leopold and Maddock (1953)) and streamflow lag (panel b). The markers are color coded according to the ratio between area of the downstream and upstream subcatchments, with darker color indicating higher ratio.

We argue that this perceptual model was not a priori obvious. In different places, streamflow spatial variability was best characterized by different streamflow signatures, which responded to distinct process controls (Gupta et al., 2013). For example, the Thur catchment in Switzerland showed strong differences in seasonality between subcatchment responses (Dal Molin et al., 2020), whereas all subcatchments in the Moselle had a similar seasonal behavior. The neighboring Meuse catchment showed significant groundwater exchanges between some subcatchments (Bouaziz et al., 2018), whereas in our setup groundwater exchanges were considered negligible. Moreover, as reviewed in Section 3.2, even the same streamflow signature can respond to distinct process controls in different places. For example, here we attributed baseflow variability to the permeability of lithology, whereas Tague and Grant (2004) showed that baseflow index was controlled by landscape maturity, rather than bedrock type, in the Oregon Cascades, and responded to distinct controls in other areas in the United States, which motivated alternative perceptual models in these distinct areas (Gnann et al., 2021). However, our Moselle perceptual model was a posteriori reasonable. The identified process controls that affected each individual streamflow signature were within the range of plausible controls based on prior knowledge, as reviewed in Section 3.2. They therefore appeared justifiable from a process based perspective.
Our perceptual model did not distinguish between soil and vegetation types, which can sound surprising, given that, for example, vegetation and soil are some key ingredients of many distributed models, such as SWAT (Arnold et al., 1998), LARSIM (Demuth and Rademacher, 2016) and PREVAH (Viviroli et al., 2009). Our analysis suggested that the considered streamflow data do not contain sufficient information to disentangle the distinct behavior of different soil or vegetation types. However, this conclusion may also have resulted from some limitations in our analysis, such as the loss of information that is unavoidable when compressing time series into streamflow signatures, or maps into landscape indices.

Discussion
Perceptual models can be the basis of several applications and can eventually advance our current thinking of hydrological processes in different places (Wagener et al., 2020). Their development, however, has seldom been transparent and lacks clear guidance, particularly at the regional scale. This study illustrated a path to the development of a distributed perceptual model at the regional catchment scale based on an interpretation of the subcatchment responses in a nested catchment. This approach complements the traditional perceptual model development based on fieldwork observations, which is difficult to pursue beyond the headwater scale due to the disparity between measurement and process scales (Beven, 2001a). Our approach can be classified as "top-down" in the sense that it seeks a process description "by fingering down into the (smaller-scale) processes from above". The main advantages of this approach are that it produces a process description directly at the scale of interest, and that perceptual model decisions are supported by the available data. A specific challenge of this approach is represented by the "limits of splitting" (Beven, 1996), hence by the extent to which it is possible to disaggregate catchment response into individual process controls.

Fig. 10. Schematic representation of the perceptual model of the Moselle catchment. Precipitation controls the spatial variability of streamflow average, lithology influences subsurface processes and eventually the baseflow vs. quickflow partitioning, and topography and land use control hydrograph lag times. Vegetation and soil did not appear to contribute to the observed spatial variability in streamflow signatures, and the associated processes are considered spatially uniform. Groundwater exchanges between subcatchments appeared negligible in terms of the long term water balance, and are therefore omitted.
We showed that, by leveraging a nested catchment setup, it is possible to disentangle the individual controls on specific subcatchment response signatures, and inform a distributed perceptual model. Although our perceptual model is specific to the Moselle, the proposed perceptual model development approach is general, and its feasibility encourages its adoption to build regional scale perceptual models elsewhere.
The nested catchment setup stands between the individual catchment and the large catchment sample study, favoring an intermediate balance between "depth" and "breadth" (Gupta et al., 2014), hence between process understanding and amenability to generalization. While recognizing the need to go beyond the "idiosyncrasies" of individual catchments (McDonnell et al., 2007), the nested catchment setup acknowledges the importance of region specific knowledge, which is often the key to interpret the unexplained variability of large sample studies (Gnann et al., 2021). A key advantage of the nested catchment setup is that it enables control on the water exchanges between subcatchments. This control allows studying aspects such as routing processes or groundwater exchanges, which are more difficult to examine when considering catchments in isolation. For example, as noted by Beven (2001b): "the continuity equation is the most fundamental law in hydrology, but as a hypothesis it would appear that we cannot currently verify it at the catchment scale", as there is "still no way of checking whether the catchment is indeed watertight". While this statement holds true when studying catchments in isolation, the nested catchment setup facilitates such checks, as groundwater exports at some subcatchments would need to be compensated by groundwater import at some neighbouring subcatchments (Bouaziz et al., 2018).
Our proposed approach to perceptual model development critically relied on the role of the hydrological expert. This role has been increasingly challenged in modern model development approaches, such as in distributed modeling where "field data collection is motivated mostly by model parameterization" (Burt and McDonnell, 2015), and machine learning techniques which challenge the assumption "that the world needs our theories and expertise" (Nearing et al., 2021). In our approach, expert knowledge was essential in the key stages of the perceptual modeling process, and in particular in:
(i) Interpreting catchment response time series and helping to define meaningful signatures of catchment response. Without expert judgment one can easily miss important traits of catchment response. For example, the time-to-peak signatures, which were considered important in this study, were not considered even in comprehensive collections of streamflow signatures (e.g. Yadav et al., 2007;Addor et al., 2017).
(ii) Grouping spatially resolved data into classes that are potentially useful for hydrology. Readily available catchment indicators may not contain the ones that are more hydrologically meaningful (Oudin et al., 2010). In our case, for example, the key to explain the variability of baseflow index relied on an expert-driven reclassification of 31 lithology classes into three permeability classes. An analogous endeavor to reclassify data into hydrologically relevant information is represented by the Hydrology Of Soil Types (HOST) classification developed for the UK (Boorman et al., 1995), where soils are classified based on their influence on hydrological processes.
(iii) Distinguishing cause-effect relationships from mere correlations. Several landscape and climate properties may co-vary, which makes unpicking the right controls on the spatial variability of signatures non-trivial. Resorting to generic statistical or machine learning approaches can exacerbate this problem, as these methods are structurally complex and difficult to interpret. Instead, resorting to hydrological knowledge can guide towards meaningful process interpretations, both by offering prior knowledge of plausible cause and effect relationships, and by helping to refine such relationships through process based methods. In our study, the interpretation of distinct signatures relied on individual methods, which resorted to hydrological common knowledge. For example, as shown in Table A1, average streamflow correlated well not only with average precipitation ($r_s = 0.92$), but also with average elevation or the proportion of forest ($r_s \ge 0.80$). The continuity equation helped to select precipitation as the right "cause" of streamflow average variability. However, without a hydrological framework, unpicking the right control may not have been as obvious, particularly in a possible scenario where precipitation had resulted in a lower correlation than other variables because of poor data quality.
(iv) Synthesizing results in the form of a perceptual model. This is a process of iteration, abstraction and synthesis, which seeks a plausible description of the overall system behavior that explains a set of individual pieces of evidence. Such a synthesis process is the essence of the formulation of the perceptual model, which pieces together all the individual data interpretation analyses into a coherent interpretation of catchment behavior. In this process, the role of the hydrological expert is the key to strengthen the plausibility of a process-based explanation.
It is important to recognize that a perceptual model is subject to its own uncertainties. In particular, there are sources of uncertainty affecting the calculation of climate indices, landscape indices and streamflow signatures (Westerberg and McMillan, 2015). For example, we noted that the lithology maps occasionally presented clear discontinuities, both at the borders between countries and at the boundaries between adjacent maps in the same country. We also noted that it was occasionally not possible to assess whether streamflow time series were recorded in summer or winter time, which could have induced a bias when calculating lag time signatures. Errors in the rating curve could add a bias to the assessment of streamflow averages. The streamflow filtering approach and its parameterization can affect the calculation of the baseflow index; however, these decisions are not expected to significantly affect the relative catchment ranking, which was the primary objective of the baseflow index analyses. There are also uncertainties in the interpretation of the variability of streamflow signatures. Such interpretations often require expert decisions, which are not always obvious. Because of these sources of uncertainty, the perceptual model should be regarded as a hypothesis of catchment behavior, and is therefore amenable to criticism and revision.
The hydrological expertise outlined above can be contributed by hydrologists with various backgrounds. For example, both the experimentalist and the modeler can contribute to the key stages of the perceptual modeling process, showing that the "dialog between experimentalist and modeler" (Seibert and McDonnell, 2002) can also take place beyond the headwater scale. The difference is that, while at the headwater scale such dialog can follow a bottom-up approach, from small-scale measurements to headwater-scale processes, at the regional scale, because of measurement limitations, it needs to follow the opposite, top-down approach, from the streamflow responses back to the processes that generated them. As perceptual models are often the basis for conceptual models, the approach outlined here can provide a platform for a stronger engagement of the experimentalist in the distributed model development process, that is, engagement in model structural development rather than only in data collection for model parameterization (Burt and McDonnell, 2015). In FMM2022 we illustrate how the proposed perceptual model can provide useful indications on what to account for and what to ignore in the development of a conceptual distributed model.

Conclusions
This paper has presented a top-down approach for building a perceptual model of streamflow variability at the regional scale, based on the interpretation of subcatchment responses. We used the Moselle catchment with 26 gauged subcatchments, and based our perceptual model on the interpretation of streamflow regional variability. Our approach critically relied on expert knowledge, which can be contributed both by the experimentalist and the modeler, and goes beyond the previous dialog between experimentalist and modeler, which until now has focused largely on hillslope and headwater catchment scales. This expert knowledge was especially helpful for "reading" streamflow time series and synthesizing them into a set of streamflow signatures, for distinguishing cause-effect relationships from mere data correlations, and for ultimately synthesizing results in the form of a perceptual model. The resulting perceptual model was a-posteriori reasonable, as it could be justified by process understanding, but not a-priori obvious, given the many possible causes that can potentially affect the streamflow regional variability within a nested catchment.
In summary, our Moselle perceptual model established that streamflow regional variability could be captured by four streamflow signatures: the streamflow average, the baseflow index, the time-to-peak and the relative lag between hydrographs. These streamflow signatures were affected by distinct climatic or landscape characteristics and responded to different processes, which could be considered "dominant" in the current context. The spatial variability of precipitation significantly affected the differences in the amount of streamflow observed at different subcatchments, which was attributed to processes associated with the long-term water balance. Lithology, and in particular its relative permeability, had a strong influence on the baseflow vs. quickflow partitioning. Topography and land use appeared to affect primarily the time-to-peak and the hydrograph relative lag, due to processes related to flow routing and drainage. River routing was responsible for the lag between hydrographs at successive downstream stations. The identification of all of these controls on the spatial variability of streamflow signatures downplayed other potential controls. In particular, regional groundwater flow extending beyond the subcatchment boundaries did not appear to be significant. Moreover, vegetation and soil did not appear to play a major role in explaining the spatial variability of streamflow signatures.
Although the perceptual model is specific to the Moselle context, we think that the general approach for developing this perceptual model is applicable to other nested catchments. In particular, the proposed guidelines on how to exploit expert knowledge, or even frame an experimentalist-modeler dialog at the regional scale, and on how to use the available data to interpret dominant catchment processes, may serve as a basis to build perceptual models elsewhere and under different conditions of data availability.
Finally, perceptual models are generally versatile in that they can underlie many applications. Therefore, their development deserves dedicated attention. In FMM2022 we illustrate how the proposed perceptual model can be used to inform the development of a distributed yet parsimonious hydrological model. This application, however, does not preclude the use of the perceptual model shown here for other purposes, nor its revision or refinement in future studies.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments
We thank the various data providers listed in Section 2.2. We also thank Dennis Meissner (BfG) for constructive comments on this paper, as well as for sharing data and expertise on the Moselle catchment. We acknowledge Laurene Bouaziz for providing valuable input on an earlier version of this work, and for pointing us to the French data sources.

Appendix A. Definition of streamflow signatures
Streamflow average: $\overline{Q}$ (mm/yr). The streamflow average is defined as follows:

$$\overline{Q} = \frac{1}{N_T} \sum_{t=1}^{N_T} Q_t$$

where $Q$ indicates the streamflow at an individual subcatchment, $t$ is the time index, $N_T$ is the number of observations, and the overbar indicates the average over the observation period.
Baseflow index: $Q_{BFI}$ (-). The baseflow index is defined as:

$$Q_{BFI} = \frac{\sum_{t=1}^{N_T} Q^{(b)}_t}{\sum_{t=1}^{N_T} Q_t}$$

where $Q^{(b)}_t$ is the baseflow at a given time. A popular filter for calculating the baseflow was proposed by Lyne and Hollick (1979). As recommended by Nathan and McMahon (1990), the filter was applied to daily streamflow, using three passes (forward, backward and forward), with the filtering parameter $\vartheta_b$ set to 0.925.
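The baseflow index calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in this study: it assumes the commonly cited recursive form of the Lyne and Hollick filter, in which the quickflow at each step is $\vartheta_b$ times the previous quickflow plus $(1+\vartheta_b)/2$ times the streamflow increment, with quickflow constrained to be non-negative. The function names, the zero initial quickflow and the choice to feed each pass with the baseflow of the previous pass are assumptions made for the example.

```python
from typing import List, Sequence


def _one_pass(q: Sequence[float], theta_b: float) -> List[float]:
    """Apply the recursive filter once and return the baseflow series."""
    quick = 0.0      # assumed initial quickflow (illustrative choice)
    base = [q[0]]    # consequently the first value is treated as baseflow
    for t in range(1, len(q)):
        quick = theta_b * quick + 0.5 * (1.0 + theta_b) * (q[t] - q[t - 1])
        quick = max(quick, 0.0)        # quickflow cannot be negative
        base.append(q[t] - quick)      # baseflow is the remainder
    return base


def lyne_hollick_baseflow(q: Sequence[float], theta_b: float = 0.925,
                          passes: int = 3) -> List[float]:
    """Filter the series forward, backward and forward (for passes=3)."""
    base = list(q)
    for p in range(passes):
        if p % 2 == 1:   # odd passes run backward in time
            base = _one_pass(base[::-1], theta_b)[::-1]
        else:            # even passes run forward in time
            base = _one_pass(base, theta_b)
    return base


def baseflow_index(q: Sequence[float], theta_b: float = 0.925,
                   passes: int = 3) -> float:
    """Ratio of total baseflow to total streamflow (requires sum(q) > 0)."""
    base = lyne_hollick_baseflow(q, theta_b, passes)
    return sum(base) / sum(q)
```

With these assumptions, a constant hydrograph yields a baseflow index of 1, and any hydrograph with flow peaks yields a value below 1. As noted in the Discussion, the absolute values depend on the filter and its parameterization, so the sketch is best used for ranking catchments rather than for interpreting individual values.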
Time-to-peak: $Q_{TTP}$ (h). In order to characterize the time-to-peak, we considered the time occurring from the beginning of the rising limb to the peak discharge, using hourly resolution streamflow data. In order to get a representative value for the entire time series, the following procedure was used:
• We extracted the rising limbs of the hydrographs.
• We eliminated all rising limbs where the total rise is lower than a specified threshold $\vartheta_q$, which was set at the difference between the 95% and 5% quantiles of the streamflow.
• We calculated the time length of the remaining rising limbs $T_{1:N_R}$, where $N_R$ represents the number of rising limbs.
• We used the 50% quantile of $T_{1:N_R}$:

$$Q_{TTP} = q_{50}(T_{1:N_R})$$

Relative lag: $Q_{Lag}$ (h). In order to estimate the lag between hydrographs at successive stations (indicated with $Q_A$ and $Q_B$) we proceeded as follows:
• We determined pairs of stations by considering for each station the corresponding successive downstream station in the river network.
• We excluded pairs for which the ratio between the area of the upstream subcatchment and the area of the downstream subcatchment is lower than a threshold $\vartheta_a$, which was set to 0.3. This threshold avoids excessively large contributing area in between stations.
• We calculated the Pearson cross-correlation $R_{-N_L:N_L}$ between the downstream hydrograph $Q_B$ and the lagged upstream hydrograph $Q_A$, allowing for lags $\pm N_L$ of up to ±20 h.
• We considered the lag $i_L$ that results in the maximum Pearson cross-correlation:

$$Q_{Lag} = i_L \quad \text{such that} \quad \max(R_{-N_L:N_L}) = R_{i_L} \tag{17}$$

• We further excluded pairs of stations for which $i_L$ was negative.
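The time-to-peak and relative lag procedures above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the code used in this study. The function names are hypothetical. A rising limb is taken to be a maximal run of strictly increasing values. Quantiles use linear interpolation. An undefined correlation (constant series) is treated as zero. The series are assumed to be hourly, gap-free and longer than the maximum lag.

```python
import statistics
from typing import List, Optional, Sequence, Tuple


def quantile(values: Sequence[float], p: float) -> float:
    """Linear-interpolation quantile for 0 <= p <= 1 (assumed convention)."""
    s = sorted(values)
    pos = p * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def rising_limbs(q: Sequence[float]) -> List[Tuple[int, int]]:
    """(start, end) index pairs of maximal strictly increasing runs."""
    limbs, start = [], 0
    for t in range(1, len(q) + 1):
        if t == len(q) or q[t] <= q[t - 1]:
            if t - 1 > start:
                limbs.append((start, t - 1))
            start = t
    return limbs


def time_to_peak(q: Sequence[float], dt_hours: float = 1.0) -> Optional[float]:
    """Median duration of rising limbs whose rise reaches the threshold."""
    theta_q = quantile(q, 0.95) - quantile(q, 0.05)
    durations = [(end - start) * dt_hours
                 for start, end in rising_limbs(q)
                 if q[end] - q[start] >= theta_q]
    return statistics.median(durations) if durations else None


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either series is constant."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def keep_pair(area_up: float, area_down: float, theta_a: float = 0.3) -> bool:
    """Area-ratio screening of an upstream/downstream station pair."""
    return area_up / area_down >= theta_a


def relative_lag(q_up: Sequence[float], q_down: Sequence[float],
                 max_lag: int = 20) -> Optional[int]:
    """Lag (time steps) maximizing corr(q_down[t], q_up[t - lag]).

    Returns None when the best lag is negative (pair excluded).
    """
    n = min(len(q_up), len(q_down))
    best_lag, best_r = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = q_up[:n - lag], q_down[lag:n]
        else:
            a, b = q_up[-lag:n], q_down[:n + lag]
        r = pearson(a, b)
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag if best_lag >= 0 else None
```

In this sketch a pair of stations would first be screened with `keep_pair`, and `relative_lag` would then be applied to the two hourly series. Other definitions of a rising limb (for example, allowing short plateaus) would change the extracted durations, so the limb definition is a choice that should be checked against the hydrographs.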
Half streamflow date: $Q_{HSD}$ (d). In order to confirm that the hydrographs have similar seasonal patterns, we used the half streamflow date (Court, 1962), defined as the number of days, counted from the beginning of the hydrological year (i.e. 1st of September, see Section 2.2), by which half of the annual streamflow has passed, averaged across the hydrological years.

Table A1 presents the Spearman correlation matrix between landscape indices, climatic indices and streamflow signatures. In terms of correlations between streamflow signatures and other indices, Table A1 showed that (1) Q had the highest correlation with P ($r_s = 0.92$), followed by L

In terms of climatic indices and their relation to landscape indices, Table A1 showed that P was positively correlated with several topography features, such as L
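For readers who wish to reproduce a correlation matrix of this kind, the Spearman rank correlation between two indices can be sketched as below. This is a generic illustration, not the code used to produce Table A1. It assumes the usual definition of the Spearman coefficient as the Pearson correlation of the ranks, with tied values receiving their average rank. The function names are hypothetical.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the block of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman coefficient: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / (sxx * syy) ** 0.5   # undefined if either input is constant
```

Because the coefficient depends only on ranks, it is insensitive to monotonic transformations of the indices, which suits comparisons between quantities with very different units and distributions. As emphasized in the Discussion, a high coefficient indicates co-variation and does not by itself identify the controlling process.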

Table A1
Spearman correlation matrix between landscape indices, climatic indices and streamflow signatures. The matrix is symmetric and the values are reported for the lower half only.