Quantitative historical hydrology in Europe

In recent decades, the quantification of flood hydrological characteristics (peak discharge, hydrograph shape, and runoff volume) from documentary evidence has gained scientific recognition as a method to lengthen flood records of rare and extreme events. This paper describes the methodological evolution of quantitative historical hydrology under the influence of developments in hydraulics and statistics. In the 19th century, discharge calculations based on flood marks were the only source of hydrological data for engineering design, but were later left aside in favour of systematic gauge records and conventional hydrological procedures. In the last two decades, there has been growing scientific and public interest in understanding long-term patterns of rare floods, in maintaining the flood heritage and memory of extremes, and developing methods for deterministic and statistical application to different scientific and engineering problems. A compilation of 46 case studies across Europe with reconstructed discharges demonstrates that (1) in most cases present flood magnitudes are not unusual within the context of the last millennium, although recent floods may exceed past floods in some temperate European rivers (e.g. the Vltava and Po rivers); (2) the frequency of extreme floods has decreased since the 1950s, although some rivers (e.g. the Gardon and Ouse rivers) show a reactivation of rare events over the last two decades. There is a great potential for gaining understanding of individual extreme events based on a combined multiproxy approach (palaeoflood and documentary records) providing high-resolution time flood series and their environmental and climatic changes; and for developing non-systematic and non-stationary statistical models based on relations of past floods with external and internal covariates under natural low-frequency climate variability.


Introduction
Historical hydrology is the study of the hydrological cycle before the period of continuous instrumental recording, on the basis of highly resolved man-made documentary evidence (Brázdil et al., 2006b, 2012). Most of the documented pre-instrumental records refer to hydrological extremes (floods and droughts) that produced major disruption in past societies. In this context, historical floods have been frequently reported through written, pictorial and epigraphic documentation across Europe (Brázdil et al., 2006b, 2012; Herget, 2012). Historical hydrology is at the interface between hydrology and environmental history. Based on data derived from documentary sources (non-instrumental human observations), its analysis involves the use not only of historical-archival methods, but also of hydrological modelling and stochastic frequency analysis. Traditionally, the collection of historical flood information has been mainly addressed within the field of historical climatology together with other natural phenomena such as heavy rains, windstorms, snowfalls and droughts (Brázdil et al., 2005b). Over the last 20 years, the study of historical floods has gained recognition in Europe as key to understanding the dynamics of natural hazards and their response to climate variability. Major efforts on the topic were made within the European SPHERE project (Systematic, Palaeoflood and Historical data for improvEment of flood Risk Estimation; Benito et al., 2004) and the FLOODCHANGE Advanced Grant (Deciphering River Flood Change; Kiss et al., 2015). The possibility of extending river records towards the past has opened new perspectives in the study of extreme hydrological events, whose analysis in terms of return periods, variability and tendency to clustering requires long hydrological data sets (Hall et al., 2014). However, documentary flood data are, in most cases, descriptive information, limited to a location (at human settlements), dependent upon human perception (damage caused) and possibly biased by the political, legislative and administrative (local, regional and national) contexts. Recent advances in hydrological and hydraulic modelling and in statistical-mathematical methods allow better handling of the uncertain and categorical data characteristic of historical floods, opening new applications in the study of flood hazards and climate change.
In Europe, historical documentary flood sources go back to Roman times (Camuffo and Enzi, 1996), although continuous and homogeneous written archives are known to be available only for the past 500 years (Brázdil et al., 2005a; Glaser et al., 2010). The European wealth of historical flood documents is comparable only to that of China, with 100 000 reports from 8000 localities, although quantitative description of flood hydrographs in Chinese rivers did not start until the 18th century (Luo, 1987). Chinese documents on historical floods date back in some cases 2000 years, with detailed descriptions over the most recent 600 years and complete and homogeneous data over the last 200 years (Luo, 1987). The oldest nearly continuous historical flood record is found not in Europe or Asia but in Africa, in ancient Egypt (Popper, 1951), whose economic wealth depended on the flooding of the River Nile and the annual deposition of fertile sediments along the river floodplain. At the Nilometer in Egypt, flood levels were observed and measured from 3000 BC. In the USA, historical floods include information about extreme hydrological events observed on staff gauges (water-level readings), and therefore data derived from non-recording, attendant-read staff gauges are traditionally considered historical data (Cook, 1987). Early stages of hydrology in Europe (1780-1860) involved human observations of water-level readings on staff gauges, which were not always continuous throughout the year. This problem was gradually solved with the modernisation of gauge stations, which made it possible to record flow stage continuously on counters and data loggers. This early instrumental period, which started around the 18th century and comprises the initial flood observations at staff gauges, has been considered part of historical hydrology (Brázdil et al., 2012).
Other sources of long-term pre-instrumental flood data, though sometimes with a lower time resolution, are sedimentary and botanical records, known as palaeofloods (cf. Baker, 2008; Fig. 1). Palaeostage indicators include various types of geologic evidence (flood deposits and geomorphic features) and woody debris, as well as morphologies related to direct physical damage to riverine vegetation (e.g. scars on trees and tilting; Herget, 2012; Benito and Díez-Herrero, 2015). A number of studies have combined historical and geological indicators to ascertain the magnitude and frequency of past floods, increasing the robustness of the frequency analysis of rare floods (Benito et al., 2010).
This paper aims to describe the different techniques and approaches used to obtain quantitative information from historical flood data, and to draw attention to its different scientific and engineering applications. The specific targets are (1) to describe the historical flood data sources leading to robust estimations of long-term flood discharge records, (2) to review different techniques used for reconstructing the magnitude and frequency of specific past floods using documentary evidence, (3) to describe the approaches used in flood frequency analysis with historical discharges, and (4) to illustrate how historical quantitative hydrology can contribute to the solution of environmental and engineering problems.

Quantitative historical hydrology
The primary goal of historical hydrology is to collect information on past extreme floods, such as date, relative magnitude, damage and socio-economic impacts at the time (Brázdil et al., 2006b, 2012; Glaser et al., 2010). More recently, there has been a growing interest in quantifying these descriptive data through the reconstruction of flow depths, discharges and hydraulic properties associated with historical flooding (Fig. 1). This quantification may be numerical (peak flow) or categorical (damage classification). Regarding numerical data, the observed flood water levels associated with a given historical flood can be transformed by hydraulic calculations into velocities and discharges, in a procedure analogous to the depth-discharge relationships used to determine the flow rate at gauge stations (Cook, 1987; Benito et al., 2004; Herget et al., 2014). Flood magnitudes can also be classified in terms of resulting damage or social impacts (e.g. Sturm et al., 2001). The use of secular records to analyse possible changes in the magnitude and frequency of individual floods at specific sites or for specific rivers can support flood hazard assessment and also enable the identification of interconnections between flood frequency and severity and climate, land use and river morphology (Macdonald and Black, 2010). Moreover, historical flood data often include other, less commonly used information on the societal and economic consequences of these natural disasters (Coeur, 2003). Yet these valuable data on the role of floods, through time, in local and national societal and policy changes remain a largely unexplored field that can bring new insights into the public perception of risk.
The use of documentary flood data in hydrological studies usually comprises four phases of analysis: (1) compilation and assessment of flood dates and water levels; (2) classification of events according to flood water level (exact stage), described inundation zones (minimum or maximum flood level) and reported damage; (3) estimation of flood magnitude, usually peak discharge, associated with documented and site-observed evidence; and finally (4) use of historical flood data in the flood frequency analysis. The first two steps require prior expertise in historical archive research, both written and cartographical, which will not only produce a record of historical floods but also identify flooded sites and morphological changes in river channels and floodplains over the historical period analysed. The third task requires hydraulic and hydrodynamic analyses, mainly drawn from engineering applications, to assign a flow magnitude to the documentary evidence. Finally, the estimated historical discharge data can be merged with instrumental records in a flood frequency analysis in order to determine the discharges associated with probability quantiles.

Figure 1 (partial caption). Organization of historical and palaeoflood data, using the described thresholds ($X_i$), and multiple types of observations to support flood frequency analysis. $K_i$ corresponds to the number of flood peaks during the last $n_i$ years that exceeded the $X_i$ threshold but not the $X_{i-1}$ threshold. An upper bound level ($X_u$) may be used to limit the maximum discharge. Data types: E: flood peak is known; LB: flood was bigger than $X_i$, which is known; UB: the upper flood level of known magnitude ($X_u$) was not exceeded over a certain time period; DB: flow level was within the interval given by $X_u$ and $X_i$. (c) Data source characteristics, timing, stage information, and typical temporal framework of systematic (instrumental) and non-systematic data (palaeoflood and documentary evidence). Modified after Benito and O'Connor (2013).
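As a rough illustration of how exact, lower-bound and interval observations can enter a frequency analysis together with gauged peaks, the sketch below writes a censored log-likelihood for annual flood peaks. It is not the procedure of any study cited here: the Gumbel distribution, the function names and every numerical value are assumptions made for illustration, and the sketch assumes one peak per year and a single perception threshold for the historical period. Exactly known peaks contribute the density, floods known only to have exceeded a level contribute the exceedance probability, floods bracketed between two levels contribute the probability of the interval, and the remaining historical years contribute the probability of staying below the threshold.

```python
import math


def log_pdf(q, mu, beta):
    """Log-density of a Gumbel distribution (location mu, scale beta)."""
    z = (q - mu) / beta
    return -z - math.exp(-z) - math.log(beta)


def log_cdf(q, mu, beta):
    """Log of the non-exceedance probability F(q)."""
    return -math.exp(-(q - mu) / beta)


def survival(q, mu, beta):
    """Exceedance probability 1 - F(q), computed stably for small values."""
    return -math.expm1(log_cdf(q, mu, beta))


def censored_log_likelihood(mu, beta, gauged, exact, lower_bounds,
                            intervals, threshold, n_hist_years):
    """Log-likelihood combining gauged and non-systematic flood data."""
    ll = sum(log_pdf(q, mu, beta) for q in gauged)       # systematic record
    ll += sum(log_pdf(q, mu, beta) for q in exact)       # peak is known
    ll += sum(math.log(survival(x, mu, beta))            # peak exceeded x
              for x in lower_bounds)
    ll += sum(math.log(survival(lo, mu, beta) - survival(hi, mu, beta))
              for lo, hi in intervals)                   # lo < peak < hi
    n_floods = len(exact) + len(lower_bounds) + len(intervals)
    # Remaining historical years: the threshold was not exceeded.
    ll += (n_hist_years - n_floods) * log_cdf(threshold, mu, beta)
    return ll


# Hypothetical discharges in m3/s, for illustration only.
gauged = [300.0, 450.0, 520.0, 610.0, 380.0]
ll_wide = censored_log_likelihood(450.0, 200.0, gauged, [1500.0], [1200.0],
                                  [(1000.0, 1400.0)], 1000.0, 200)
ll_narrow = censored_log_likelihood(450.0, 50.0, gauged, [1500.0], [1200.0],
                                    [(1000.0, 1400.0)], 1000.0, 200)
print(ll_wide > ll_narrow)
```

In practice the parameters would be found by maximising this function with a numerical optimiser; comparing two candidate parameter sets, as done here, only shows that the historical extremes penalise a distribution whose upper tail is too short.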

History of hydraulics and early flood estimations in Europe
The first hydraulic parameter described in rivers was the water level reached during extraordinary events. The ancient Romans observed and recorded flood water levels at bridges (e.g. Albenga, 1940), and Roman engineers used this knowledge in designing their infrastructure (Lorenz and Wolfram, 2011). The study of flood levels was important for ensuring efficient and long-lasting hydraulic structures and left a legacy to future generations, as can be read in the inscription placed on the 30 m high Alcántara Bridge (Spain): "Pontem perpetui mansurum in saecula mundi" (bridge to last forever in the perpetuity of the world; Fernández-Casado, 2008). Today, few written flood records from antiquity survive, since most were destroyed. During medieval times, water marks associated with large floods were made on bridges, houses and even on bedrock outcrops, although in a non-systematic way (Brázdil, 1998; Deutsch and Pörtge, 2009; Brázdil et al., 2012). Systematic water-level readings at gauges did not start before the later 18th century, namely in Germany, France, Austria and the Czech lands. A review of the history of these first gauge water-level readings in Europe was compiled and published by Brázdil et al. (2012), and at the national level several papers have been published recently on the development of discharge measurements (e.g. for northern Germany, Deutsch, 2010). The revolutionary step in flood hydrology records, which made it possible to calculate discharge from stage at gauge stations, did not take place until the 19th century, with the measurement of flow velocity. The first rotor current meter was developed by Woltman (1790) and underwent several improvements during the 19th and 20th centuries (Lanser, 1953). Current-meter gauging permitted the measurement of the flow rating curve (depth-discharge relationship) at the first established gauge stations on European rivers (Fig. 2). Rating curves were established and calibrated most reliably for low-to-moderate flows. Extreme flow discharges were frequently obtained by extrapolating the rating curve. As this approach is less reliable and carries numerous uncertainties because calibration is missing at high stages, discharges associated with high flood levels were instead estimated by hydraulic formulae. Hence, reconstructions of historical floods gained robustness in step with advances in hydraulic research.
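The weakness of extrapolating a rating curve can be made concrete with a small sketch. The power-law form Q = a h^b, the function names and the gaugings below are assumptions for illustration (the gaugings are synthetic, not measurements from any station discussed here). The curve is fitted to low-to-moderate flows and then evaluated at a much higher flood-mark depth, where no gauging constrains it.

```python
import math


def fit_rating_curve(depths, discharges):
    """Least-squares fit of log Q = log a + b log h; returns (a, b)."""
    xs = [math.log(h) for h in depths]
    ys = [math.log(q) for q in discharges]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def rated_discharge(depth, a, b):
    """Discharge predicted by the fitted power-law rating curve."""
    return a * depth ** b


# Synthetic gaugings at low-to-moderate flows (depth in m, discharge in m3/s).
gauged_depths = [0.5, 1.0, 1.5, 2.0, 2.5]
gauged_q = [20.0 * h ** 1.6 for h in gauged_depths]

a, b = fit_rating_curve(gauged_depths, gauged_q)

# Extrapolation far beyond the gauged range, e.g. to a historical flood mark.
flood_mark_depth = 6.0
q_extrapolated = rated_discharge(flood_mark_depth, a, b)
print(round(a, 3), round(b, 3), round(q_extrapolated, 1))
```

Because the exponent b multiplies the logarithm of depth, a small error in b fitted at shallow depths grows quickly at flood stage, which is one reason hydraulic formulae were preferred for the largest events.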
The first equations accounting for the resistance law in open channels were established during the late 18th and 19th centuries. In 1775 the French engineer Antoine Chézy (1718-1798) proposed the first resistance formula, based on a study of the water transfer from the River Yvette to Paris by an earthen canal (Herschel, 1897). Chézy's formula can be derived mathematically from two assumptions, as described by Chow (1959). First, Chézy assumed that the force resisting the flow per unit area of channel bed is proportional to the square of the velocity, i.e. $KV^2$, where $K$ is a constant of proportionality; acting over the contact surface given by the wetted perimeter $P$ of the section times the length of the canal $L$, the total resisting force is $KV^2PL$. The second assumption is that the total force of resistance equals the effective gravity-force component parallel to the channel bottom, namely $\omega ALS$, where $\omega$ is the unit weight of water, $A$ is the cross-section area, $L$ is the channel length and $S$ the slope. Since $\omega ALS = KV^2PL$, where $A/P$ is the hydraulic radius $R$ and $\sqrt{\omega/K}$ can be replaced by a factor $C$, Chézy's equation is then obtained as

$$ V = C\sqrt{RS} \tag{1} $$

In this formula, the factor $C$ is the main source of uncertainty in velocity calculations for known river cross sections, since it must be estimated by indirect methods or assumptions. For instance, the first discharge estimates of the 1857 flood in the Ardèche (France) were obtained by multiplying the calculated velocity by 0.7 to reflect the unequal distribution of flow rate and channel roughness (De Mardigny, 1860). During the 19th century different experiments were performed to determine the variables involved in Chézy's factor $C$, among which the most relevant were those of Ganguillet and Kutter (1869) and Bazin (1897). Henri Émile Bazin (1829-1917) conducted laboratory studies on channels made of cement, brick, wood and rock, proposing a formula in which $C$ is a function of $R$:

$$ C = \frac{87}{1 + m/\sqrt{R}} \tag{2} $$

where $m$ is a roughness coefficient that varies between 0.06 for canals made of concrete and 3.17 for earth channels with rough conditions (Bazin, 1897).
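The two assumptions behind Chézy's formula and the role of the factor C can be checked numerically. The sketch below is illustrative only: the function names and the section values are hypothetical, and the constant 87 in the Bazin coefficient is the metric form as commonly tabulated, taken here as an assumption. It computes the velocity V = C sqrt(R S), verifies that this velocity balances the resisting force K V^2 P L against the gravity component omega A L S, and evaluates a Bazin coefficient C = 87 / (1 + m / sqrt(R)).

```python
import math


def chezy_coefficient(omega, K):
    """C = sqrt(omega / K), from the force balance omega*A*L*S = K*V^2*P*L."""
    return math.sqrt(omega / K)


def bazin_coefficient(R, m):
    """Bazin's C as a function of hydraulic radius R (m); metric form assumed."""
    return 87.0 / (1.0 + m / math.sqrt(R))


def chezy_velocity(R, S, C):
    """Mean velocity from Chezy's equation, V = C * sqrt(R * S)."""
    return C * math.sqrt(R * S)


# Hypothetical cross section: area (m2), wetted perimeter (m), slope (-).
area = 120.0
perimeter = 40.0
slope = 0.0004
R = area / perimeter

# Hypothetical roughness value m for the Bazin coefficient.
C = bazin_coefficient(R, m=1.30)
V = chezy_velocity(R, slope, C)
Q = V * area
print(round(C, 1), round(V, 2), round(Q, 0))
```

The discharge follows from multiplying the mean velocity by the flow area, so any error in C carries over proportionally to Q.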
In 1868, Philippe Gaspard Gauckler, engineer at the Ponts et Chaussées, proposed two formulae for the estimation of the flow velocity $V$:

$$ V = \lambda_1 R^{4/3} S \tag{3} $$

$$ V = \lambda_2 R^{2/3} S^{1/2} \tag{4} $$

where $\lambda_1$ and $\lambda_2$ are coefficients describing the boundary roughness. The second formula (Eq. 4) applies for the laminar flow regime, and its coefficient may also be expressed as $\lambda_2 = 1/n$, $n$ being a roughness parameter, as was proposed later by the Irish engineer Robert Manning (1816-1897), although apparently Manning was unaware of Gauckler's work. The popular Manning equation (Manning, 1891) is expressed in metric units as

$$ V = K R^{2/3} S^{1/2} \tag{5} $$

where $K$ is a factor of flow resistance that was later modified to $1/n$, where $n$ is known as Manning's roughness coefficient. Later, Strickler (1923) proposed a new expression of the Chézy coefficient, $C = KR^{1/6}$, which, applied in the Chézy formula, yields the same expression as the Gauckler-Manning formula (Eq. 5). The relative significance of the different contributions to this still widely applied approach remains debated (Williams, 1970; Dooge, 1992; Hager, 2005). Consequently, the equation is also known as the Gauckler-Manning-Strickler formula. Another fundamental set of hydraulic equations for unsteady open channel flow was formulated by Barré de Saint-Venant, who in 1843 published the correct derivation of the Navier-Stokes equations, identifying the coefficient of viscosity and its role in the local acceleration and fluid turbulence (Anderson, 1997).
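The equivalence between Strickler's coefficient and the Gauckler-Manning formula can be verified in a few lines: substituting C = K R^(1/6) into V = C sqrt(R S) gives V = K R^(2/3) S^(1/2), which is the Manning equation with K = 1/n. The function names and the numerical values below are hypothetical and serve only as a check of the algebra.

```python
def manning_velocity(R, S, n):
    """Gauckler-Manning velocity in metric units, V = (1/n) R^(2/3) S^(1/2)."""
    return (1.0 / n) * R ** (2.0 / 3.0) * S ** 0.5


def strickler_chezy_coefficient(R, K):
    """Strickler's expression of the Chezy coefficient, C = K R^(1/6)."""
    return K * R ** (1.0 / 6.0)


def chezy_velocity(R, S, C):
    """Chezy velocity, V = C sqrt(R S)."""
    return C * (R * S) ** 0.5


# Hypothetical reach: hydraulic radius (m), slope (-), Manning's n.
R, S, n = 2.0, 0.001, 0.035

v_manning = manning_velocity(R, S, n)
v_chezy = chezy_velocity(R, S, strickler_chezy_coefficient(R, 1.0 / n))
print(abs(v_manning - v_chezy) < 1e-9)
```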
The Gauckler-Manning formula has been used extensively since the earliest estimates of historical floods to calculate the mean flow velocity at sections with observed historical flood levels (Pardé, 1925b). During the first half of the 20th century, research on historical floods developed under the strong influence of physical geographers. Maurice Pardé (1893-1973), professor of potamology in Grenoble, was probably the most prolific European author in the study of extreme historical floods, with over 300 papers and 2000 handwritten notes and letters on the subject, including a vast compilation of documentary floods worldwide. In Austria, the first known publications calculating discharges from historical flood marks were those of Schwarzl (1956) and Kresser (1950, 1957). The highest flood level marked on public buildings and passages near the river corresponds to the 1501 flood, estimated as ca. 14 000 m³ s⁻¹ at Engelhartszell in Upper Austria (Kresser, 1957). In Italy, early discharge estimations from historical flood levels were carried out by engineers in the service of water authorities, and the data were reported in internal publications (e.g. HOPR, 1935; Giovannelli and Allodi, 1960s, cited in Zanchettin et al., 2008). On the River Po, the 1857, 1868 and 1872 flood levels were studied in the context of the large floods recorded during the early decades of the 20th century (Visentini, 1936, 1938; Visentini and Pardé, 1936), with the conclusion that historical flood peaks were of a lower magnitude than contemporaneous extreme events.
These early 20th century advances in hydraulic models and stream flow measurements supported the re-estimation of discharges associated with certain large historical floods. For instance, on the River Isère the 1740 flood discharge, originally estimated as 1844 m³ s⁻¹ by the engineer Christophe Dausse in 1780 (Lang et al., 2003), was later revised to 2000 m³ s⁻¹ by Pardé (1925b) using the Chézy and Gauckler equations. Pardé (1961) provided the first worldwide inventory of flood discharges, including a large number of pre-instrumental floods, many obtained from letters and unpublished reports whose reliability is difficult to verify. The compilation of discharges from these early historical flood studies resulted in the plotting of regional envelope curves of maximum flood peaks, or discharge per unit area, versus drainage area (Wundt, 1949; Pardé, 1961; Francou and Rodier, 1967). The envelope curve is a deterministic method that substitutes space for time under the assumption that there is a limit to the precipitation supplied to a basin within a given climatic and geographic domain (Myers, 1967).
The study of past floods declined from the mid-20th century, as databases of stream flow measurements supported by statistical analysis provided standard hydrological methods for flood hazard applications, in which historical extremes were regarded as anecdotal, imprecise outliers relative to systematic gauged records (Klemeš, 1989).

Quantitative historical flood records: approaches and methodology
Most of the early historical flood discharge estimates were obtained from flood marks using hydraulic equations under the assumption of uniform flow conditions. However, there is a large body of documentary evidence describing past inundation levels and flood damage (impacts) that, after critical analysis and interpretation, offers a quantitative reference to the associated flood discharges (Benito et al., 2004). For the past 20 years, there has been a growing interest in reconstructing flood chronologies and their discharge estimates from documentary descriptions of flood water level (Benito et al., 2003a). The reconstruction of long historical flood records from documentary sources depends on the availability of historical data for model implementation and calibration, on the uncertainty in past river topography, and on the detailed configuration of the river channel and vegetation for roughness characterisation. Extracting quantitative hydrological data from documentary evidence typically involves two phases of analysis: (1) documentation and assessment of documentary evidence of flood stage during specific historical floods and (2) relating the identified flood evidence to flood discharge, based on hydraulic calculations. Documentary evidence of flood water levels includes flood marks, with an indication of single or multiple flood levels (Fig. 3), and relative water levels obtained from narrative descriptions of places affected by flooding (e.g. a church, one of several streets, singular buildings). To convert water level to discharge, the correlative water-surface evidence (marks, inundated sites) of a historical flood is matched to the water-surface profile of a known discharge obtained from hydraulic modelling. A major problem for this hydraulic analysis is the reconstruction of river channel geometry at the time of flooding, which introduces a high uncertainty in the discharge calculations, mainly in alluvial rivers (i.e. with a mobile river bed). Historical maps showing the channel morphology at the time of the flood, where they exist, can be used to reconstruct the former morphology. The estimated historical flood discharges are then structured into different threshold levels that were exceeded by floodwaters over specific periods of time, which are the input data necessary for flood frequency analysis (Fig. 1b).

Documentary data sources and types
Several review papers have described the main data sources of historical hydrology (Brázdil et al., 2006b, 2012). They are often grouped into three data categories: handwritten documents (e.g. narrative sources, administrative and ecclesiastical reports, and personal correspondence); printed sources (special print newspapers, reports and technical papers); and iconographic sources (stone marks, historical photographs and paintings, old cartography and cross sections).
Narrative descriptions of floods may be biased by perception, both that of the writer and that of the present-day researcher (Brázdil et al., 2006b). Since information from documentary sources is mainly qualitative, several classifications have been suggested that take into account the severity of flood impacts (Sturm et al., 2001). Barriendos et al. (2003) proposed a qualitative classification of flood severity for records prior to the instrumental period, taking channel overflow as the reference: ordinary flood, when water remains within the channel and banks; extraordinary flood, resulting in localised overbank flow, with some damage but without major destruction; and catastrophic flood, with inundation resulting in general damage and destruction of infrastructure.
A study of historical hydrology involves (1) collection of documentary sources (e.g. municipal, ecclesiastical and private archives) and consideration of already compiled information (e.g. books, databases and reports); (2) compilation of instrumental data for the area of investigation (rainfall, flow and level of the river, synoptic information); (3) collection of graphic information (e.g. paintings, photographs); and (4) cross-referencing of historical and palaeoflood (geological records) information, instrumental data and graphic information. It is useful to produce a standardised data form to be filled in for each documentary flood reference (Fernandez de Villalta et al., 2001; Casas et al., 2003; Barriendos et al., 2003, 2014).
As mentioned above, early studies on historical floods were carried out mainly at sites with flood marks, for which peak flow values were calculated. However, most of the documented historical flood information is not recorded as engraved flood marks (plates or inscriptions) but as descriptions of inundations that affected out-of-channel areas. This is the case, for instance, of the River Ter (north-eastern Spain), for which a record of 170 floods was compiled for the period between 1322 and 1987 (Barriendos and Martin-Vide, 1998), of which only 77 floods were found to be registered on plates or wall inscriptions. The majority of the flood events mentioned in documentary sources were nevertheless associated with a comprehensive description of the sites or streets affected by the flood water level. The reconstruction of flood discharges from descriptive flood levels, beyond rough extrapolations and estimations (e.g. Schiller, 1987), was first addressed for the River Tagus in central Spain (Benito et al., 2003a). Discharges associated with documentary-based floods were reconstructed at four places, namely Aranjuez (since AD 1557), Toledo (AD 1113), Talavera (AD 1203) and Alcántara (AD 1856). The hydraulic analysis and interpretation of the flood level from historical documents was inspired by methods commonly used in palaeoflood hydrology (Baker, 2008). Flood levels associated with documentary data at these locations include (1) flood marks on houses, mills, monasteries and bridges; (2) descriptions of flooded areas such as orchards, roads and streets; (3) descriptions of non-flooded areas (e.g. a singular building surrounded by water but not inundated); and (4) relative flood level with respect to previous floods (e.g. the 1840 flood was 2 m higher than the flood occurring in 1820). The interpretation of these flood water-level indicators provides four different kinds of discharge information: (1) highest water level or peak discharge (equal to the flood stage), (2) minimum flood discharge, (3) maximum flood discharge, and (4) discharge quoted as a range in the case of two recorded levels. Field work is required to accurately locate (GPS survey) the sites referred to in the historical documents (location of buildings, streets, bridges, gates, walls, etc.), as well as to ascertain the altitudes of the flood stages or levels referred to. Thus, for all documentary evidence along the study reach, the flood height can be estimated and the associated flood discharge reconstructed (Fig. 4). Similar studies have since been conducted at other European sites (Table 1) with well-documented floods and a rich historical archive. New methodological approaches were also developed, as in the study conducted by Roggenkamp and Herget (2014) for the River Ahr at Ahrweiler (Germany). The hydrograph of the 1910 flood was reconstructed from sequenced historical photographs showing the same inundated street with a street clock hanging on the wall of a building, which precisely linked time and flood water level.

Table 1. Summary of historical discharge estimates for selected rivers in Europe. Only sites with multiple historical flood records in the recent literature are included. Additional discharge data of individual pre-instrumental floods can be found in Pardé (1961) and Francou and Rodier (1967). Legend: 1: main documentary data source: EPM: epigraphic marks; O: observed flood levels; AD: archival descriptions; PFD: palaeoflood deposits; FC: flood categories. 2: discharge estimate method: RC: from rating curve; GM: Gauckler-Manning equation; CEq: Chézy equation; SBW: step-backwater method; CD: critical depth; 2D-M: bidimensional model; WLT: water-level transposition from the nearby gauge station; RRM: rainfall-runoff model; STA: statistical method. Flood discharge re-estimated as 3900 m³ s⁻¹ (Kirsch and Pohl, 2011).
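The way different kinds of documentary evidence translate into exact, minimum, maximum or range information on discharge can be sketched as a small data structure. Everything named below is hypothetical: the evidence labels, the `DischargeInfo` class and in particular the stage-discharge function, which stands in for whatever hydraulic calculation is available at the site. Evidence expressed relative to an earlier flood is assumed to have been converted to an absolute stage beforehand.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DischargeInfo:
    kind: str                # "exact", "minimum", "maximum" or "range"
    q_min: Optional[float]   # lower bound on peak discharge, if any
    q_max: Optional[float]   # upper bound on peak discharge, if any


def interpret_evidence(evidence: str,
                       rating: Callable[[float], float],
                       stage: float,
                       upper_stage: Optional[float] = None) -> DischargeInfo:
    """Translate one documentary water-level indicator into discharge bounds."""
    if evidence == "flood_mark":          # peak stage is known
        q = rating(stage)
        return DischargeInfo("exact", q, q)
    if evidence == "flooded_site":        # water reached at least this level
        return DischargeInfo("minimum", rating(stage), None)
    if evidence == "non_flooded_site":    # water stayed below this level
        return DischargeInfo("maximum", None, rating(stage))
    if evidence == "two_levels":          # flooded and non-flooded references
        if upper_stage is None:
            raise ValueError("two_levels evidence needs an upper stage")
        return DischargeInfo("range", rating(stage), rating(upper_stage))
    raise ValueError(f"unknown evidence type: {evidence}")


def example_rating(stage_m: float) -> float:
    """Hypothetical stage-discharge relation, for illustration only."""
    return 100.0 * stage_m ** 1.5


mark = interpret_evidence("flood_mark", example_rating, 4.0)
street = interpret_evidence("flooded_site", example_rating, 4.0)
both = interpret_evidence("two_levels", example_rating, 4.0, upper_stage=9.0)
print(mark, street, both)
```

Keeping the bounds explicit, rather than collapsing every record to a single number, preserves the information needed later when the records are grouped by exceedance thresholds for frequency analysis.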

Discharge estimation from documentary records
The most critical component of applied historical flood hydrology is the estimation of the discharge associated with documented floods. Flow estimates from hydraulic analysis are usually based on the elevation of flooded or non-flooded sites and epigraphic marks relative to local channel geometry. The approaches used to estimate discharge from known flood water levels range from simple hydraulic formulae to one- or multi-dimensional hydraulic models (Kutija, 2003; Lang et al., 2004). Most historical flood studies assume one-dimensional flow, with calculations based on (1) uniform flow equations (e.g. Gauckler-Manning equation), (2) gradually varied flow models (e.g. standard step method calculation), and (3) one-dimensional Saint-Venant equations. In complex reaches, multi-dimensional modelling may reduce the uncertainties associated with reconstructing flood discharge (Denlinger et al., 2002). The Gauckler-Manning equation is applied for uniform, steady and one-dimensional flow conditions in straight channels of even gradient and regular width (Chow, 1959). In most hydraulic computations, the calculated discharge is sensitive to uncertainty in the selection of the roughness parameter and to changes in cross-section topography and in the urban configuration of the floodplains since historical times. Manning's n is typically estimated from reference tables, from examination of photographs of typical channels whose roughness coefficients are known, or from the experience of the researcher in similar river settings. Herget et al. (2014) proposed a method based on the Manning equation in which discharge is calculated separately for individual homogeneous units of the inundated cross-section area. For each sub-section, the relevant parameters of the Manning equation (R, S, n) are assessed for the time of the historical flooding based on old maps and written descriptions. The uncertainty in the estimation of roughness and hydraulic geometry is represented by different scenarios based on varying assumptions. Herget and Meurs (2010) applied this method to the 1374 flood of the River Rhine in Cologne, the highest in the local record. The calculated discharge was validated by applying the equation to recent floods and comparing the results with nearby cross sections. Since flow in natural channels is typically not uniform, large errors can be expected when the Gauckler-Manning equation is applied to a single flood mark and one cross section. Separating the cross-section area into more or less homogeneous units reduces this problem significantly. This approach cannot be used for floods caused by ice jams or by temporary obstruction of bridges by woody debris, where the raised water level reflects the blockage rather than an increased discharge (Herget et al., 2014).
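The idea of computing discharge separately for homogeneous units of the cross section, under alternative roughness scenarios, can be sketched as follows. This is an illustration of the general principle only, not a reproduction of the published procedure: the unit geometry, the slope, the roughness values and all names are hypothetical, and the wetted perimeter of each unit is assumed to exclude the water-to-water boundaries between neighbouring units.

```python
def unit_discharge(area, wetted_perimeter, slope, n):
    """Manning discharge (m3/s) for one homogeneous unit of the cross section."""
    hydraulic_radius = area / wetted_perimeter
    velocity = (1.0 / n) * hydraulic_radius ** (2.0 / 3.0) * slope ** 0.5
    return area * velocity


def total_discharge(units, slope, scenario):
    """Sum unit discharges using the roughness value of the chosen scenario."""
    return sum(
        unit_discharge(u["area"], u["perimeter"], slope, u["n"][scenario])
        for u in units
    )


# Hypothetical cross section split into channel and two floodplain units.
# Each unit carries a smooth and a rough estimate of Manning's n.
units = [
    {"name": "left floodplain", "area": 150.0, "perimeter": 100.0,
     "n": {"smooth": 0.05, "rough": 0.10}},
    {"name": "main channel", "area": 800.0, "perimeter": 110.0,
     "n": {"smooth": 0.025, "rough": 0.035}},
    {"name": "right floodplain", "area": 200.0, "perimeter": 120.0,
     "n": {"smooth": 0.06, "rough": 0.12}},
]
slope = 0.0002

q_high = total_discharge(units, slope, "smooth")   # low resistance
q_low = total_discharge(units, slope, "rough")     # high resistance
print(round(q_low), round(q_high))
```

Reporting the pair of values, rather than a single figure, is one simple way to carry the roughness uncertainty through to the reconstructed peak discharge.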
The most common historical flood discharge calculations are applied to gradually varied flow conditions (Benito et al., 2003a; Lang et al., 2004; Naulet et al., 2005). River channel geometry is generally irregular in shape and surface roughness, resulting in non-uniform flow conditions. Gradually varied flow analyses usually assume a steady state (constant discharge), for which flow depth varies with distance but not with time (Chow, 1959). The typical approach relating historical flood evidence to discharge uses the step-backwater method for gradually varied water-surface profile computation (Benito et al., 2003a). In this method, water-surface profiles are calculated by solving the conservation of mass and energy equations in their one-dimensional forms. Available public-domain computer routines, such as the US Army Corps of Engineers HEC-RAS (Hydrologic Engineering Center, 2010), compute water-surface profiles for specified discharges and energy loss coefficients. Multiple analyses give synthetic rating curves at sites of interest, thus providing a basis for calculating historical flood discharge from the elevation of a water mark, known inundated locations or other high-water evidence (Thorndycraft et al., 2006). The influence of uncertainties in flow modelling variables on historical flood discharges can be assessed by testing the outcomes of plausible ranges of Manning's n values and possible changes in channel geometry. A challenge for this approach is the need for several quantified cross sections along a valley, which are usually hard to determine from historical descriptions. Consequently, the variation of the geometry and roughness parameters along a valley can only be assumed.
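Once a step-backwater model has been run for a series of discharges, reading a historical discharge off the resulting synthetic rating curve is a simple interpolation. The sketch below assumes that the model runs already exist: the (discharge, stage) pairs are hypothetical stand-ins for model output under two roughness assumptions, and the function name is illustrative. It does not call or emulate any hydraulic modelling software.

```python
def discharge_from_stage(stage, curve):
    """Linearly interpolate discharge for a given stage.

    curve: list of (discharge, stage) pairs sorted by increasing stage,
    e.g. one pair per steady-flow model run at the site of interest.
    """
    for (q0, h0), (q1, h1) in zip(curve, curve[1:]):
        if h0 <= stage <= h1:
            return q0 + (q1 - q0) * (stage - h0) / (h1 - h0)
    raise ValueError("stage lies outside the modelled range")


# Hypothetical synthetic rating curves (discharge in m3/s, stage in m)
# for a lower and a higher Manning's n along the reach.
curve_low_n = [(1000.0, 4.0), (2000.0, 6.0), (3000.0, 7.5)]
curve_high_n = [(1000.0, 4.6), (2000.0, 6.8), (3000.0, 8.4)]

flood_mark_stage = 6.8  # surveyed elevation of a documented water level

q_upper = discharge_from_stage(flood_mark_stage, curve_low_n)
q_lower = discharge_from_stage(flood_mark_stage, curve_high_n)
print(q_lower, q_upper)
```

The same water mark yields a smaller discharge when the reach is assumed rougher, so the two curves bracket the estimate in the spirit of the sensitivity testing described above.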
Recent advances in two-dimensional computational hydraulics (Kutija, 2003) have been considered for historical flood studies (Fernandez Bono and Grau-Gimeno, 2003; Calenda et al., 2005). In alluvial rivers, overbank flows show three-dimensional behaviour, which should be analysed with two- or three-dimensional models. However, even two-dimensional modelling requires a large amount of high-resolution channel and floodplain topographic information to define the working mesh, together with detailed data on changes in historical topography following the construction of buildings and roads and on the spatial variability of roughness.
Flood hydrographs are essential for different engineering applications including dam operation and safety (Swain et al., 2006). The few attempts to obtain hydrographs from palaeoflood studies have used probabilistic hydrographs (England et al., 2003; Benito et al., 2011). Recently, Elleder (2010) reconstructed the February 1784 flood of the River Vltava in Prague based on peak flood marks, daily newspapers and explanatory notes accompanying early instrumental measurements at the Klementinum observatory. The hydrograph showed a time to peak of only 45 h in Prague, with a 4 m water-level rise within 12 h, a steep rise exceeded only by the August 2002 flood (Brázdil et al., 2005a, 2006a) (Fig. 5).

Assumptions and uncertainty evaluation of the estimated historical discharges
The reconstruction of historical flood records is subject to assumptions, limitations and uncertainties that may affect the interpretation of the number of floods and the estimated discharge. A key element in this quantitative analysis is the transformation of known flow-level information into accurate discharge estimates. In this task, it is critically important to confirm that the flood marks and sites used as flood-level indicators have not been moved since the flood event. Previous experience shows that (1) epigraphic marks can easily be removed from their original site during restoration works (Fig. 6); and (2) original landmarks (streets, walls or floors) may have been buried or renamed (Deutsch et al., 2006; Munzar et al., 2006; Macdonald, 2007). Non-typical examples of flood marks are the flood levels recorded on the River Vltava in Prague with respect to the head of the "bearded man" (in Czech Bradáč) (Elleder, 2003; Brázdil et al., 2005a), or those for the River Elbe at Děčín, close to the Czech-German border, marked on the castle rock (Brázdil et al., 2005a; Kotyza, 2006).
A second set of uncertainties is related to the hydraulic setting and the transformation of water level into discharge. The hydraulic calculations assume a precise characterisation of the channel geometry, which remains invariant during the flood event, and, in most cases, steady flow under subcritical conditions. In a given cross section, the portions of effective flow (flow in the downstream direction) should be distinguished from regions of the channel that do not convey discharge downstream (e.g. eddy flows). Ideally, the model should be calibrated using known water-surface elevations and discharges from contemporary floods and, if necessary, adjusted to reflect the historical vegetation and past urban configuration. Although the discharge can be estimated from a single historical mark or piece of flood evidence, confidence in the result is enhanced when calculated water-surface profiles are matched by several flood marks or other inundation references along the study reach (Machado et al., 2015).
The effect of bridges, channel constrictions and obstacles in general, if they become blocked by ice jams or woody debris during flooding, constitutes another issue to be considered in the hydraulic modelling (Fig. 5a). Such blocking is likely to produce a backwater effect that raises the flood level upstream. River lining and encroachment of the river bank may alter the floodway area and change the hydraulic conditions at the referenced flood marks through time. For instance, the Danube inundated ca. 1000 km² of floodplains during the September 1899 flood, whereas flood storage during the June 2013 flood was only a few hundred km², with significant effects on the flood peak discharges (Blöschl et al., 2013). Note that the largest pan-European flood event of February-March 1784 was also caused by a sudden release of water from local ice jams (Brázdil et al., 2010) (Fig. 5). Such effects confirm the necessity to work only with the original position of flood marks and to use multiple documentary evidence to reconstruct flood levels (Herget, 2012).
Assumptions concerning the hydraulic method and models applied to calculate discharge, the type of flow (uniform versus non-uniform), the effective flow area and the choice of energy-loss coefficients cause uncertainty in discharge estimates. For instance, for the River Elbe in Dresden, the official peak discharge of the 1845 flood is 5700 m³ s⁻¹, whereas water-profile calculations by means of one- and two-dimensional hydraulic models provide a value of 4335 m³ s⁻¹, a discrepancy attributed to an inaccurate stage-discharge relationship at the gauge (Pohl, 2008).

Flood frequency analysis
A fundamental problem in flood hydrology is the analysis of flood frequency, or of the discharge corresponding to a given occurrence interval (return period). This estimate is necessary for the correct design and location of structures (dams, bridges, industrial buildings) and for flood hazard mapping. The statistical analysis of extreme values has improved greatly since the early work of Foster (1924), who described the application of frequency curves to engineering problems. Fisher and Tippett (1928) developed frequency distributions of maximum values, subsequently applied to floods by Gumbel (1945). Flood frequency analysis (FFA) was presented as a replicable method for the quantification of uncertainty based on a large number of flood data. There are several important problems in applying flood statistics to the study of large floods. The first concern is the complexity of natural phenomena and the second is the assumption that data collected at river gauges are representative of the largest and rarest floods (Baker, 1994). The design engineer Vance A. Myers (1967) highlighted the consequences of using FFA methods with short flow records for dam design: "In reading the early reports one can sense a confidence by the less cautious that the flood record was stable, that nature had shown what she could do on a particular stream in a relatively few decades. This confidence was later found to be misplaced. The more cautious showed a feeling that major floods were among the imponderables, whose evaluation was impossible by the techniques then available. Some earth dams built during this period have failed due to insufficient spillway capacity". The design of sensitive infrastructure improved greatly when historical flood data were considered. For instance, the spillway capacities of the Saucelle (13 282 m³ s⁻¹) and Aldeadávila (12 500 m³ s⁻¹) dams on the River Duero (Spain) were designed on the basis of a deterministic application of reconstructed historical discharges from the 1597, 1739 and 1909 flood marks (Rodríguez-Marquina, 1949a, b).
G. Benito et al.: Quantitative historical hydrology in Europe

The use of historical floods for FFA has become more frequent since the pioneering publications by Benson (1950) and Leese (1973), which incorporated non-systematic (historical) data together with gauge records. Documentary data are particularly valuable where there is an account of all floods exceeding a certain stage (threshold), or censoring level, over a long period prior to instrumental gauging (Fig. 1). Generally, this minimum flood level required to ensure documentary evidence of flooding is related to a perception threshold above which contemporary society was affected in terms of damage or social disruption (Stedinger and Cohn, 1986; Francés et al., 1994). This perception threshold is frequently related to a flood water level within urban zones and at buildings with distinct characteristics (e.g. market, bridge, church) (Barriendos et al., 2003). The most common approach assumes that each flood exceeding this threshold has been recorded in the documentary record (Fig. 1a, b). For instance, the Aranjuez Royal gardens (Spain) are flooded when the River Tagus overtops its banks at discharges exceeding 300 m³ s⁻¹ (Benito et al., 2003a). A list of $k_i$ observations above an arbitrarily specified discharge threshold $X_i$ in $n_i$ years is similar to the analysis of partial duration series (data censored above threshold; Stedinger and Cohn, 1986; Francés et al., 1994; Francés, 2001). Statistically, it is important to confirm that years lacking a historical flood record really corresponded to flows smaller than the discharge threshold $X_i$. The threshold level of flood perception may vary through time with changes in human activities and the occupation of riverside areas. Naulet et al. (2001) classified documentary flood data into four types (Fig. 1b): (1) exact type, when the flood discharge is known (e.g. water mark); (2) lower bound type, if the flood level is known to have been higher than a known lower bound ($X_i$); (3) upper bound type, if it is only known that the flood at time t was smaller than the upper bound $X_u$; and (4) double bound type, if the flood discharge is known to have been bracketed between $X_i$ and $X_u$. These historical flood data (known as non-systematic) can be combined with systematic annual data from gauge stations. FFA commonly uses parametric models (with a defined finite number of parameters) combining a cumulative probability distribution function and a parameter estimation method (Stedinger and Cohn, 1986). Most of the distribution functions used in conventional FFA (Gumbel, log-Pearson, GEV) have been applied with historical data (Stedinger et al., 1993). Several methods have been used to estimate the statistical parameters of the selected distribution functions (Strupczewski et al., 2014). The most efficient methods for incorporating imprecise and categorical data are (1) maximum likelihood estimators (Leese, 1973; Stedinger and Cohn, 1986; Francés, 2001); (2) the method of expected moments (Cohn et al., 1997; England et al., 2003); and (3) Bayesian methods (Kuczera, 1999; O'Connell et al., 2002; O'Connell, 2005; Reis and Stedinger, 2005). These methods have been reviewed by Stedinger et al. (1993) and Francés (2004), and case study applications in Europe can be found, among others, in Calenda et al. (2009) and Botero and Francés (2010).
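To illustrate how exact, lower bound and double bound data can enter a maximum likelihood estimation together with gauge records, the sketch below writes a censored-data log-likelihood for a Gumbel distribution. It is a simplified illustration under stated assumptions (a single perception threshold, independent annual maxima, a Gumbel parent), not the formulation of any of the cited studies; all function names and data values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def gumbel_logpdf(x: float, loc: float, scale: float) -> float:
    z = (x - loc) / scale
    return -math.log(scale) - z - math.exp(-z)


def gumbel_logcdf(x: float, loc: float, scale: float) -> float:
    # log F(x), with F(x) = exp(-exp(-(x - loc) / scale))
    return -math.exp(-(x - loc) / scale)


def gumbel_logsf(x: float, loc: float, scale: float) -> float:
    # log(1 - F(x)), computed from log F(x) for numerical stability
    return math.log(-math.expm1(gumbel_logcdf(x, loc, scale)))


def censored_log_likelihood(
    loc: float,
    scale: float,
    gauged: Sequence[float] = (),        # systematic annual maxima
    exact: Sequence[float] = (),         # historical floods of known discharge
    lower_bound: Sequence[float] = (),   # floods known only to exceed this value
    double_bound: Sequence[Tuple[float, float]] = (),  # (lower, upper) brackets
    threshold: float = 0.0,              # perception threshold X_i
    n_hist_years: int = 0,               # length of the historical period n_i
) -> float:
    k = len(exact) + len(lower_bound) + len(double_bound)
    if n_hist_years < k:
        raise ValueError("historical period shorter than number of floods")
    ll = sum(gumbel_logpdf(x, loc, scale) for x in gauged)
    ll += sum(gumbel_logpdf(x, loc, scale) for x in exact)
    ll += sum(gumbel_logsf(x, loc, scale) for x in lower_bound)
    for low, high in double_bound:
        prob = (math.exp(gumbel_logcdf(high, loc, scale))
                - math.exp(gumbel_logcdf(low, loc, scale)))
        ll += math.log(prob)
    # Years without a documented flood are assumed to lie below the threshold.
    ll += (n_hist_years - k) * gumbel_logcdf(threshold, loc, scale)
    return ll


# Hypothetical data, for illustration only.
example = censored_log_likelihood(
    loc=500.0, scale=200.0,
    gauged=[300.0, 450.0, 800.0, 520.0],
    exact=[1800.0], lower_bound=[1500.0], double_bound=[(1500.0, 2200.0)],
    threshold=1500.0, n_hist_years=200,
)
```

The last term is where the perception threshold matters: every year without a documented flood contributes the probability of staying below $X_i$, which is why it is important to confirm that undocumented years really were below the threshold. In practice the function would be maximised over `loc` and `scale` with a numerical optimiser.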
A recent review by Kjeldsen et al. (2014) observed a scarce use of historical data for frequency estimates in 16 European countries, identifying three main reasons: (1) the lack of unified database repositories; (2) the uncertainty associated with discharge estimates; and (3) concerns about violation of the stationarity assumption when using historical data, i.e. when annual probabilities are equated to historical frequencies of occurrence. Concerning the second item, Macdonald et al. (2014) showed that frequency analysis using exact discharges and analysis using minimum discharges exceeded by the historical event yield almost the same uncertainty. Viglione et al. (2013) demonstrated that the number of floods exceeding the perception threshold is more important than the uncertainty in the discharge value. A reduction in the error is obtained when the return period of the largest historical flood is about twice the length of the pre-instrumental record (Strupczewski et al., 2014). Many documentary-based flood studies have shown that flood frequency has been influenced by the internal variability of atmospheric circulation, with flood clusters in some periods (Glaser et al., 2010), or by environmental impacts such as land use (Benito et al., 2010) and engineering works (Machado et al., 2015). A simple test of stationarity for censored samples (systematic and/or non-systematic) was proposed by Lang et al. (1999), assuming that the flood series can be described by a homogeneous Poisson process (Naulet et al., 2005). It is recommended to select a sample above a high discharge threshold in order to produce an exhaustive and homogeneous set, avoiding bias related to archive availability or flood risk exposure (Barriendos et al., 2003). Macdonald et al. (2014) analysed the sensitivity to different discharge thresholds, showing that selecting a high discharge threshold decreased the uncertainty of high-magnitude flood estimates.
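A stationarity check based on a homogeneous Poisson process can be sketched as follows. The sketch relies on a standard property: conditional on the total number of threshold exceedances in the record, the number observed up to any intermediate year follows a binomial distribution with probability equal to the elapsed fraction of the record. This is offered in the spirit of the test mentioned above; the exact formulation of Lang et al. (1999) may differ, and the function names and flood years are hypothetical.

```python
import math
from typing import Iterable, List, Sequence, Tuple


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, j) * p ** j * (1.0 - p) ** (n - j)
               for j in range(k + 1))


def binom_interval(n: int, p: float, alpha: float = 0.05) -> Tuple[int, int]:
    """Central (1 - alpha) tolerance interval for a binomial count."""
    lower = next(k for k in range(n + 1) if binom_cdf(k, n, p) >= alpha / 2)
    upper = next((k for k in range(n + 1)
                  if binom_cdf(k, n, p) >= 1 - alpha / 2), n)
    return lower, upper


def stationarity_flags(flood_years: Sequence[int], start: int, end: int,
                       check_years: Iterable[int],
                       alpha: float = 0.05) -> List[Tuple[int, int, int, int]]:
    """Return (year, observed, lower, upper) wherever the cumulative number
    of floods leaves the interval expected under a constant occurrence rate."""
    n = len(flood_years)
    flags = []
    for year in check_years:
        if not start < year < end:
            raise ValueError("check years must lie inside the record")
        elapsed_fraction = (year - start) / (end - start)
        observed = sum(1 for y in flood_years if y <= year)
        lower, upper = binom_interval(n, elapsed_fraction, alpha)
        if not lower <= observed <= upper:
            flags.append((year, observed, lower, upper))
    return flags


# Hypothetical series of floods above a perception threshold, 1700-2000.
evenly_spread = list(range(1710, 2000, 20))   # one flood every 20 years
clustered = list(range(1705, 1750, 3))        # all floods before 1750
checks = range(1750, 2000, 50)
even_flags = stationarity_flags(evenly_spread, 1700, 2000, checks)
cluster_flags = stationarity_flags(clustered, 1700, 2000, checks)
```

A series whose cumulative count stays inside the interval is compatible with a constant occurrence rate, whereas a strongly clustered series is flagged, pointing either to a genuine flood-rich period or to a change in the perception threshold or archive coverage.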

Discharge of historical floods in the context of instrumental records
Quantitative historical hydrology provides a multi-centennial perspective of extreme flood magnitudes. In Europe, there is a long tradition in the study of historical floods in the context of historical climatology (Brázdil et al., 2005b; Glaser et al., 2010), although reconstructions of peak discharges associated with documentary data are still scarce (Fig. 7). Table 1 shows a compilation of 46 case studies with historical discharge estimates at sites with multiple floods, with estimates published in peer-reviewed papers or cross-checked with the original historical sources. Numerous studies suggest that current flood magnitudes are not unusual within the context of the last 1000 years, with good examples for the rivers Rhine (Herget and Meurs, 2010; Wetter et al., 2011), Tiber (Calenda et al., 2005), Llobregat (Thorndycraft et al., 2005), Trent (Macdonald, 2013) and Gardon (Sheffer et al., 2008; Neppel et al., 2010). In general, the largest historical floods from the last 500 years show higher peak flows than the largest gauged floods (Fig. 7). The largest differences in discharge between historical and gauged records are mainly characteristic of small catchments in mountain basins and of Mediterranean rivers (e.g. Llobregat, Ter, Ticino, Tiber and Isère rivers). In these regions, knowledge of historical peak flows may provide important insight for flood hazard prevention. For example, the 2002 flood of the River Gardon (France), which claimed the lives of 23 people and caused Euro 1.2 billion worth of … In some cases, recent flooding in central and northern Europe reached magnitudes similar to or even higher than those reconstructed from documentary records, as for the River Vltava in Prague (Brázdil et al., 2005a; Elleder et al., 2013) and the records in the lower River Po (HOPR, 1935; Zanchettin et al., 2008) (Fig. 7). In the case of the River Findhorn in the UK, the official gauged discharge for the 1970 flood was initially 60 % higher than the reconstructed peak flow for the "Muckle spate" flood of 1829, although the 1970 peak flow was later recalculated to be below that of the 1829 flood (McEwen and Werritty, 2007).

Multi-proxy analysis of past hydrological extremes
Documentary archival data on floods are ubiquitous across Europe, although the computation of peak discharges depends on the availability of reliable epigraphic flood marks or thoroughly documented water-level descriptions. In most cases, these flood marks are located in urban settings with frequent changes of the river channel topography, which increase the uncertainty of computed flood discharges. The combination of historical and palaeoflood (sedimentary) data has been demonstrated to be a very effective tool for improving the catalogue of past flooding and reducing uncertainties in flood discharges (Thorndycraft et al., 2005). Palaeostage indicators from sedimentary records (slackwater flood deposits) are frequently preserved within stable bedrock cross sections, which are suitable settings for the hydraulic estimation of flood discharges (Benito and O'Connor, 2013). Moreover, the age uncertainty of the numerical dating methods (radiocarbon and optically stimulated luminescence) used in palaeoflood studies may be refined using known documentary floods (Medialdea et al., 2014). The SPHERE project revealed the complementarity of palaeoflood and historical flood information (Benito and Thorndycraft, 2004), with major gains in the quality of past flood records in terms of timing and discharge, as demonstrated in the studies performed for the rivers Gardon (Naulet et al., 2005; Sheffer et al., 2008), Ardèche (Sheffer et al., 2003; Naulet et al., 2005), Llobregat (Thorndycraft et al., 2005) and Guadalentin (Benito et al., 2010; Medialdea et al., 2014). Recent palaeoflood reconstructions from floodplain sediments have analysed geochemical proxies from continuous alluvial records and used local documentary flood data to calibrate the palaeohydrological records (e.g. Swiss Alps, Schulte et al., 2015; River Severn in mid-Wales, Jones et al., 2012; River Rhine in the Netherlands, Toonen et al., 2015). Flood sediments accumulated in floodplain sinks (e.g. palaeomeanders and floodbasin environments) can be studied with high-resolution techniques (e.g. X-ray-scanned samples) to obtain continuous records of grain size and geochemical content (Zr / Ti, Zr / Rb and Sr / Ti) indicative of the detrital fraction deposited by floods (Schulte et al., 2015). The reconstructed palaeoflood magnitudes are obtained after calibrating the ages from geochronological techniques (radiocarbon) against known historical events and normalising grain size and geochemical content, where the coarse tail of the grain-size distribution is used to estimate peak flood discharges or severity indexes (Toonen et al., 2015).
Newly emerging palaeoflood archives from lake records show great potential for synergy with documentary floods to complete regional records of extreme events and to understand flood-climate relationships (e.g. Wilhelm et al., 2012; Wirth et al., 2013; Corella et al., 2014). For instance, in Montcortés Lake (north-eastern Spain), a varved sediment core accumulating since the 14th century contains detrital layers associated with intense rainfall (> 80 mm day⁻¹), recording higher storm frequency during AD 1347-1400 and AD 1844-1894; both periods coincide with severe floods of the nearby River Segre (Corella et al., 2014). In non-varved lake systems, palaeoflood stratigraphy can be compared with historically documented flood records as a means to improve the age-depth model of the stratigraphic log (Schillereff et al., 2014). Another group of palaeoflood techniques suitable for combination with documentary sources are those based on botanical and ecological evidence (Bodoque et al., 2014). The presence of lichens on boulders in river channels can be used to date the flood responsible for their transport, once a lichenometric growth curve has been established for the lichen species in the study area (Foulds et al., 2014). Dendrogeomorphology uses information from flood damage to trees and bushes, dating floods at annual resolution (Bodoque et al., 2014). Commonly, these palaeoflood methods are most suitable for mountain stream environments, where documentary sources provide a means to establish the age biases and minimise errors during the calibration process.

Flood magnitude sensitivity to climate change
Climate variability may affect both flood frequency and magnitude, with greater sensitivity for the largest "rare" floods (50-year flood and higher) than for smaller, frequent floods (2-year floods; Knox, 1993, 2000). The study of historical floods in the context of climate variability has focussed on high-quality complete data sets classified according to severity of damage (Sturm et al., 2001) to infer changes in flood frequency, meteorological causes and seasonality (Glaser et al., 2010). The classification of historical floods according to peak discharge, or to discharges over some threshold, allows further analysis of the sensitivity of flood frequency in relation to magnitude. Furthermore, a classification based on discharge classes allows the most recent instrumental records to be included in an integrated analysis, avoiding the bias of classifications based only on flood damage, which varied over time with exposure and vulnerability. Based on literature sources (cf. details below), eight records were compiled from different European rivers in which numerical or categorical flood magnitudes for the historical period were complemented with comparable data from gauged records (Fig. 8). Two flood categories were differentiated: (1) catastrophic floods (CAT), associated with high flood discharge or severe damage, and (2) extraordinary floods (EXT), causing inundation of the floodplain with moderate-to-minor damage. The detected flood changes are highly dependent on the observational window (Hall et al., 2014), with identification of flood-rich and flood-poor periods over the historical record and flood trend detection over the instrumental period.
In central Spain, an increased frequency of large floods was identified in AD 1000-1200, 1525-1625 and in the late 19th-early 20th centuries (Benito et al., 2003a; Fig. 8a). During the second half of the 20th century, the frequency of floods decreased in connection with a dominant positive mode of the North Atlantic Oscillation during winter months; however, flow regulation by dams has also played an important role in this decline in flood frequency since the mid-1950s. The decreasing trend in annual maximum floods was also detected in the analysis of gauge records from a set of rivers within the Tagus River basin under quasi-natural flow conditions (Mediero et al., 2014). In the River Segura (south-eastern Spain), the frequency of catastrophic (autumn) floods has decreased since the late 19th century, together with the frequency of intense rainfall events, except in some decades (e.g. the 1970s and 1980s) in which intense rainfall and flooding co-existed with severe drought conditions (Fig. 8b; Machado et al., 2011). In the River Gardon (southern France), the frequency of large floods (> 50-year floods) has decreased since the late 19th century, whereas extraordinary and ordinary floods increased during the 20th century (Fig. 8c) (Sheffer et al., 2008; Neppel et al., 2010). Similarly, historical flood series from north-eastern Spain indicate a lack of statistically significant trend for large catastrophic floods, whereas extraordinary floods show a significant rise, especially since 1850 (Barrera-Escoda and Llasat, 2015).
Several types of meteorological events and different storm types result in mixed flood distributions, each characterised by individual probability distribution parameters (Hirschboeck et al., 2000). Climatic variability can lead to flood magnitude/frequency changes affecting one or several types of flood populations (e.g. early spring snowmelt, convective storms), with relevant implications for the non-stationarity of the statistical parameters supporting flood probability analysis (Milly et al., 2008). Documentary records provide information on the prevailing circulation types producing floods, based on changes in flood seasonality. Macdonald (2012), studying the River Ouse, a large catchment from a UK perspective, identified a higher frequency of summer floods in AD 1700-1849 than in AD 1850-1999. Furthermore, the combined documentary and instrumental flood record (Macdonald and Black, 2010) illustrates that the frequency of extraordinary floods within the range of 350 to 500 m³ s⁻¹ increased during the 20th century, in particular compared with the most extreme floods (> 500 m³ s⁻¹; Fig. 8e).

Figure 9. Non-stationary model of the "100-year" flood over the last 300 years based on the dependence of the distribution parameters on the associated external covariates (winter NAO index and reservoir index). The horizontal line represents the 100-year flood from a log-normal distribution using documentary and instrumental records under a stationarity assumption (after Machado et al., 2015).
In central Europe, long records of the Elbe and Oder/Odra rivers showed a decrease in winter floods during the last 80 to 150 years (Fig. 8g, h), while summer floods showed no significant trend (Mudelsee et al., 2003). This change in seasonal flood patterns is reflected in the recent trend towards an overall decrease in flood magnitude, although in the case of the River Vltava (Czech Republic) the August 2002 flood reached the highest peak flow on record (Fig. 8f; Brázdil et al., 2005a). In the River Rhine at Basel (Switzerland), severe summer (JJA) floods were particularly frequent between 1651 and 1750, in relation to enhanced precipitation; severe winter (DJF) floods have not occurred since the late 19th century, despite a significant increase in winter precipitation (Wetter et al., 2011).
In some regions, the potential for ice jams on rivers should be considered in the analysis and interpretation of winter peak flows along centennial records. An ice jam can generate water levels above those of rainfall floods, owing to inundation of the area behind the ice blockage or to the rapid release of water after the ice jam fails (Beltaos, 2008). During the AD 1550-1850 period, ice cover on large mainland European rivers combined with late winter and spring snowmelt generated very large floods, similar to what is observed today at higher latitudes. In the Netherlands, many floods over the 1750-1860 period were associated with ice jams, particularly on the River Waal (e.g. in 1781, 1784, 1799, 1805 and 1809; Driessen, 1994). Detailed accounts of the winter ice-jam floods of 1784 are widespread across much of western and central mainland Europe (Demarée, 2006; Brázdil et al., 2010, 2012) (Fig. 5). In the River Mosel, the water level of the 28 February 1784 flood was significantly higher than any other recorded during the past millennium (Sartor et al., 2010), although any discharge estimate should consider that ice jams can raise water levels to much higher elevations than open-water floods (Beltaos, 2008). Other factors enhancing flood severity through time include the timing of glacier melt (Debret et al., 2010). Global warming is introducing changes in the spatial (latitudinal) and temporal (seasonal) distribution of flooding related to ice and snowmelt (Beltaos and Prowse, 2009). For example, changes in the hydrometeorological conditions that generate flooding may enhance flood magnitude in Norway, owing to an earlier onset of snowmelt-related flooding in the region (Hisdal et al., 2006), whereas in mainland Europe flooding related to ice jams is now unlikely to occur (Kundzewicz et al., 2014).

Historical floods in a non-stationary hydrology
The comparative analysis of historical records at different catchments across Europe indicates that the temporal distribution of flood frequency is predominantly modulated by regional meteorological triggers (Glaser et al., 2010). In regions where floods are generated by several types of weather conditions, each flood population is described by its own probability distribution, resulting in mixed distributions. Long-term climate variability may alter the seasonal weather patterns producing floods (summer, winter, snowmelt, etc.) and consequently undermine the assumption of stationarity of the flood frequency distribution. Stationarity has been qualitatively described as the idea that natural systems oscillate within an unchanging envelope of variability (Milly et al., 2008). In the case of extreme events, secular records of historical floods show a temporal variability (clusters) fluctuating at a multidecadal timescale. However, the underlying driving factors causing past departures from stationarity are far from being random phenomena. The temporal changes in the trajectory and statistics of a variable may be linked to natural, low-frequency variations of the atmospheric circulation, external forcings (solar cycles) or anthropogenic changes. Therefore, a detailed characterisation of the natural variability of past floods will facilitate the attribution and modelling of future variability due to natural and human impacts.

Hydrol. Earth Syst. Sci., 19, 3517-3539, 2015 www.hydrol-earth-syst-sci.net/19/3517/2015/

The statistical parameters may show increasing or decreasing changes that can be modelled (as a trend or smooth function) using time as a covariate (Villarini et al., 2009), or they can be related to hydro-climatic covariates such as circulation indices (e.g. Pacific Decadal Oscillation, PDO; North Atlantic Oscillation, NAO; Arctic Oscillation, AO) characterising this low-frequency climatic variability (López and Francés, 2013). The application of these non-stationary models to historical and palaeoflood hydrology requires a numerical characterisation of the occurrence rate (covariate) during the recorded period. Several studies have demonstrated the relationships of flood frequency and magnitude with circulation indices, such as the NAO index (Salgueiro et al., 2013). The application of a non-stationary flood frequency analysis to a 300-year record with 32 documented floods (> 350 m³ s⁻¹) of the River Tagus was successful in modelling the fluctuations of flood quantiles (e.g. the "100-year flood") using the North Atlantic Oscillation index and a reservoir index as external covariates (Machado et al., 2015). This non-stationary modelling was based on Generalized Additive Models for Location, Scale and Shape parameters (GAMLSS; Rigby and Stasinopoulos, 2005), which describe the temporal variation of statistical parameters (mean, variance) in probability distribution functions (Villarini et al., 2010; López and Francés, 2013). In this example, the non-stationary models show that the peak flow associated with a "100-year" flood (0.01 annual exceedance probability) may range between 560 and 4180 m³ s⁻¹, whereas under stationary conditions the best fit was obtained with a log-normal distribution, giving a discharge of 1450 m³ s⁻¹ (Fig. 9). These results illustrate that under stationary statistics the risk assumed is much higher than the one established in the design of infrastructure. Moreover, concepts such as return period, design quantile (return level) and risk under non-stationary conditions should be revised when the annual exceedance probability changes from year to year (Obeysekera and Salas, 2014).
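The consequence of a changing annual exceedance probability for design risk can be illustrated with a short calculation. Assuming independence between years, the probability of at least one exceedance of a design discharge over a design life is one minus the product of the annual non-exceedance probabilities. The sketch below compares a constant probability of 0.01 with a drifting one; the function name and the linear drift are hypothetical and are not derived from the River Tagus record or from Obeysekera and Salas (2014).

```python
from typing import Sequence


def risk_of_exceedance(annual_probs: Sequence[float]) -> float:
    """P(at least one exceedance) = 1 - prod(1 - p_t), assuming independent years."""
    no_exceedance = 1.0
    for p in annual_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("annual probabilities must lie in [0, 1]")
        no_exceedance *= 1.0 - p
    return 1.0 - no_exceedance


design_life = 100  # years

# Stationary case: the design flood keeps a 0.01 annual exceedance probability.
stationary = [0.01] * design_life

# Hypothetical non-stationary case: the exceedance probability of the same
# design discharge drifts linearly from 0.01 to 0.03 over the design life.
drifting = [0.01 + 0.02 * t / (design_life - 1) for t in range(design_life)]

risk_stationary = risk_of_exceedance(stationary)
risk_drifting = risk_of_exceedance(drifting)
```

Under the stationary assumption the risk over 100 years equals 1 - 0.99^100 (about 0.63), whereas the drifting case gives a higher value for the same design discharge, which is the sense in which a design based on stationary statistics can understate the risk actually assumed.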

Concluding remarks
This paper reviews the scientific progress in the quantification of large historical floods from its early stages, before the advent of automatic hydrological stations. In the last two decades, new approaches have been developed to obtain continuous and reliable flood magnitude data sets from documentary records and to analyse their statistics and temporal patterns, illustrating the strengths, limitations and future prospects of the various methods. Most early discharge computations were obtained at sites with known water elevation from flood marks or by extrapolation of a rating curve at sites with a staff gauge. Only in the last decades have flood descriptions from rich documentary evidence been used to estimate discharges of floods exceeding a threshold of perception, extending the record of flood discharges up to several hundreds of years. The reconstruction of secular historical records of extreme floods is relevant to solving two major scientific and engineering problems: (1) flood hazard assessment using FFA (data censored over thresholds of perception), and (2) quantification of the largest discharges in a given catchment as evidence for a deterministic approach in safety risk analysis of critical facilities (dams, bridges, power plants). Historical flood records have gained attention among hydrologists against the background of new statistical methods of FFA using non-systematic data and, more recently, of non-stationarity modelling. However, the use of historical flood records in practical flood hazard studies is still scarce. Documentary flood data can benefit from combined use with palaeoflood records such as fluvial sediments, botanical and dendrochronological records, flood-produced detrital layers in lakes and marine records. In particular, fluvial sediments deposited in slackwater environments have proved very effective when combined with documentary data sets to improve the flood frequency analysis of rare and extreme floods.

Europe holds numerous unexplored archives on historical floods, their causes and their socio-economic impacts. There is a great opportunity to generate scientific knowledge about the largest and rarest floods reported through historical times and to use them to improve public awareness and perception of natural risks. The present paper is a significant contribution to historical hydrology in Europe (Brázdil et al., 2006b, 2012), extending its potential for quantification of past documentary-based floods in Europe.
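As an illustration of FFA with data censored over a threshold of perception, the following minimal sketch combines a gauged record with historical peaks that exceeded the threshold. It is not the procedure of any particular study cited here: the Gumbel distribution, the coarse grid search and all numerical values are assumptions chosen for brevity, and the function names are hypothetical. The likelihood has three parts: the density of each gauged annual maximum, the density of each historical peak of known magnitude, and the probability of staying below the threshold for every historical year without a recorded flood.

```python
import math

def gumbel_logpdf(x, loc, scale):
    """Log-density of the Gumbel (EV1) distribution."""
    z = (x - loc) / scale
    return -math.log(scale) - z - math.exp(-z)

def gumbel_logcdf(x, loc, scale):
    """Log of the non-exceedance probability F(x)."""
    return -math.exp(-(x - loc) / scale)

def gumbel_quantile(loc, scale, return_period):
    """Discharge with the given return period (years)."""
    return loc - scale * math.log(-math.log(1.0 - 1.0 / return_period))

def censored_loglik(loc, scale, systematic, historical, threshold, n_years):
    """Log-likelihood of gauged peaks plus historical peaks above a threshold.

    systematic: annual maxima from the gauged period
    historical: known peaks above `threshold` during `n_years` pre-gauge years
    """
    if scale <= 0:
        return -math.inf
    ll = sum(gumbel_logpdf(q, loc, scale) for q in systematic)
    ll += sum(gumbel_logpdf(q, loc, scale) for q in historical)
    # Years without a recorded flood are only known to lie below the threshold.
    n_below = n_years - len(historical)
    ll += n_below * gumbel_logcdf(threshold, loc, scale)
    return ll

def fit_grid(systematic, historical, threshold, n_years, steps=60):
    """Coarse grid search around moment estimates (illustrative, not efficient)."""
    mean = sum(systematic) / len(systematic)
    var = sum((q - mean) ** 2 for q in systematic) / (len(systematic) - 1)
    scale0 = math.sqrt(6.0 * var) / math.pi
    loc0 = mean - 0.5772156649 * scale0
    best = (censored_loglik(loc0, scale0, systematic, historical,
                            threshold, n_years), loc0, scale0)
    for i in range(steps + 1):
        loc = loc0 * (0.5 + i / steps)                 # 0.5 to 1.5 times loc0
        for j in range(steps + 1):
            scale = scale0 * (0.5 + 1.5 * j / steps)   # 0.5 to 2.0 times scale0
            ll = censored_loglik(loc, scale, systematic, historical,
                                 threshold, n_years)
            if ll > best[0]:
                best = (ll, loc, scale)
    return best[1], best[2]

# Hypothetical data in m^3/s: 10 gauged years, 200 historical years,
# two documented floods above a perception threshold of 2400 m^3/s.
systematic = [820, 640, 1100, 930, 710, 1500, 880, 990, 760, 1250]
historical = [2600, 3100]
threshold, n_years = 2400, 200

loc_hat, scale_hat = fit_grid(systematic, historical, threshold, n_years)
q100 = gumbel_quantile(loc_hat, scale_hat, 100.0)
print(f"loc={loc_hat:.0f}, scale={scale_hat:.0f}, Q100={q100:.0f} m^3/s")
```

In practice a three-parameter distribution and a proper optimiser would usually replace the Gumbel model and the grid search, and floods known only to have exceeded the threshold would enter through the exceedance probability rather than the density.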

Figure 1. Sources of quantitative flood information. (a) Sketch of a cross section showing various flood level indicators from palaeofloods (sediments and damage on trees), and documentary-based floods (i.e. those able to cause damage or socio-economic disruption). For historical hydrology, only floods exceeding a flood level related to a perception threshold (X_i) over a period of n_i years (n_1 > n_2 > n_3) are recorded. Palaeofloods from stratigraphic records are related to geomorphic thresholds. (b) Organization of historical and palaeoflood data, using the described thresholds (X_i), and multiple types of observations to support flood frequency analysis. K_i corresponds to the number of flood peaks during the last n_i years that exceeded the X_i threshold but not the X_{i−1} threshold. Upper bound level (X_u) may be used to limit the maximum discharge. Data types: E: flood peak is known. LB: flood was bigger than X_i, which is known. UB: the upper flood level of known magnitude (X_u) was not exceeded over a certain time period. DB: flow level was within the interval given by X_u and X_i. (c) Data source characteristics, timing, stage information, and typical temporal framework of systematic (instrumental) and non-systematic data (palaeoflood and documentary evidence). Modified after Benito and O'Connor (2013).

Figure 2. Stream flow measurements at gauge station Borgo a Mozzano (43°59′30.73″ N; 10°33′10.04″ E) in the River Serchio (Italy), probably taken in the 1920s or 1930s. The observer sits in a box suspended on a cable that moved along the cross section, from which an old device was operated to measure the stream flow velocity (photo courtesy of Regione Toscana - Genio Civile di Bacino Toscana Nord e Servizio Idrologico Regionale).

Figure 3. Historical flood marks of the River Loire on the Wilson Bridge in Tours (France). The elevations of the flood marks are after the Direction Régionale de l'Environnement, de l'Aménagement et du Logement (DREAL), Centre-Val de Loire (http://www.centre.developpement-durable.gouv.fr). Discharge values associated with the flood levels are after Duband (1996).

Figure 4. (a) Upstream view of the River Tagus into the Huerta del Rey (King's Garden) during flooding on 6 March 1947 at Toledo (Spain). The peak discharge was reached at 10:00 and the picture was taken at 15:20 (by courtesy of Mariano García Bargueño). The water level at peak discharge was ca. 1.5 m above the railway station ground level (main building at the centre of the photo). In the background the arrow points out the Galiana Palace, on the left bank of the Tagus River, just over 1 km from the old part of Toledo (the Galiana Palace was built at the site of an earlier summer villa and Arab garden of Al-Mamun, king of the Taifa of Toledo in AD 1043-1075) (Benito et al., 2003a). (b) Rating curve of a cross section next to the upper picture obtained from step-backwater calculations (HEC-RAS model) with the elevation of relevant historical flood evidence (flood marks and description of inundated sites). The largest historical floods occurred during the Medieval Climatic Anomaly and are followed by the 1876 and 1947 floods (Fernandez de Villalta et al., 2001).

Figure 5. (a) Pictorial representation of the River Vltava during the February 1784 flood, showing ice floes and woody debris accumulated at the Charles Bridge in Prague (copperplate by F. Erban, Museum of the City of Prague, catalogue no. 125.387). The ice jams at the bridge caused flooding upstream, resulting in the highest known water marks until the August 2002 flood (Brázdil et al., 2005a). (b) A flood hydrograph of the Vltava River in Prague at the Monastery of the Knights of Cross reconstructed from documentary data for 27 February-1 March 1784 with an estimated discharge rate of 4560 m³ s⁻¹ (Brázdil et al., 2005a).

Figure 6. Stone brick with an inscription of the 1906 flood mark of the River Rhine in Koblenz. The block was originally placed on a railway bridge destroyed during World War II and later used for reconstruction of this building (Herget, 2012). This confirms the necessity to work only with the original position of flood marks and to use multiple documentary evidence to reconstruct flood levels.

Figure 7. (a) Major rivers and streams of Europe and studied sites with multiple historical flood discharge estimates. Numbers refer to places cited in Table 1. (b) Maximum specific discharge (l s⁻¹ km⁻²) of the largest historical and instrumental floods recorded at the sites referred to in Table 1. Southern Europe includes sites from Portugal, Spain, Italy and France (except the Dordogne, Isère, Loire, Rhone and Garonne rivers), and central-northern Europe the rest of the rivers. Lines are envelope boundaries of the largest specific discharges for these two data sets. Most of the historical specific discharges are above the instrumental ones, except for some rivers in central-northern Europe.
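The specific discharge plotted in Figure 7b is the peak discharge divided by the catchment area. A minimal sketch of the unit conversion follows; the site names, peak discharges and catchment areas are hypothetical values for illustration only and are not taken from Table 1.

```python
def specific_discharge(peak_m3s, area_km2):
    """Convert a peak discharge (m^3/s) into specific discharge (l/s/km^2)."""
    if area_km2 <= 0:
        raise ValueError("catchment area must be positive")
    return peak_m3s * 1000.0 / area_km2  # 1 m^3 = 1000 l

# Hypothetical sites: (name, peak discharge in m^3/s, catchment area in km^2)
sites = [
    ("Site A", 3000.0, 2000.0),
    ("Site B", 4500.0, 25000.0),
    ("Site C", 800.0, 150.0),
]

# Rank the sites by specific discharge, largest first.
ranked = sorted(sites, key=lambda s: specific_discharge(s[1], s[2]), reverse=True)
for name, peak, area in ranked:
    print(f"{name}: {specific_discharge(peak, area):.0f} l/s/km^2")
```

Normalizing by area makes floods from catchments of very different size comparable, which is why small basins can plot above large rivers despite much lower absolute discharges.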