Computational experiments on the 1962 and 1970 landslide events at Huascarán (Peru) with r.avaflow: Lessons learned for predictive mass flow simulations

a Institute of Applied Geology, University of Natural Resources and Life Sciences (BOKU), Peter-Jordan-Straße 82, 1190 Vienna, Austria b Geomorphological Systems and Risk Research, Department of Geography and Regional Research, University of Vienna, Universitätsstraße 7, 1010 Vienna, Austria c Department of Natural Hazards, Austrian Research Centre for Forests (BFW), Rennweg 1, 6020 Innsbruck, Austria d Department of Geography, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland e Department of Geophysics, University of Bonn, Meckenheimer Allee 176, 53115 Bonn, Germany


Introduction
High-mountain areas are often characterized by tightly coupled process domains. Gravitational mass movements starting at higher elevation may impact distant lower areas not only directly, but also through cascading effects sometimes involving water bodies or entrainment of basal material. The resulting complex mass movementssometimes even evolving into process chainshave led to major disasters in recorded history: the 1963 Vajont landslide and flood disaster in the Italian Alps (Genevois and Tecca, 2013);the 1962 and1970 Huascarán debris avalanches in the Cordillera Blanca, Peru (Evans et al., 2009); and the 2002 Kolka-Karmadon ice-rock avalanche in the Russian Caucasus (Huggel et al., 2005) are some notable examples. Complex mass flows and process chains often occur related to a changing cryosphere (Evans and Delaney, 2014;Haeberli and Whiteman, 2014). Climate-induced environmental changes such as glacier retreat, the associated formation of hanging glaciers, the exposure of steep, debuttressed rock or moraine walls, and the degradation of mountain permafrost facilitate the release of initial mass movements (Alean, 1985;Haeberli, 1992;Haeberli et al., 1997;Huggel et al., 2003Huggel et al., , 2010Huggel et al., , 2012Noetzli et al., 2006;Gruber and Haeberli, 2007;Harris et al., 2009;Ravanel and Deline, 2011;Krautblatter et al., 2013;Haeberli et al., 2017).
The anticipation of complex mass flows in high-mountain areasand therefore also hazard analysis approaches, such as mapping or zoninglargely involve computer simulations based on mass flow Geomorphology 322 (2018) 15-28 models. Single-phase flow models, considering either solid or fluid, or a mixture of both, have been theoretically developed during the past decades, and more recently have also been increasingly implemented in computer codes (Voellmy, 1955;Savage and Hutter, 1989;Iverson, 1997;Takahashi et al., 1992;Pitman et al., 2003;McDougall and Hungr, 2004;Pudasaini and Hutter, 2007;Christen et al., 2010).
Current key research challenges include the understanding and modelling of interaction processes at the boundary of mass flows, specifically entrainment along the flow path and flow transformations. The simulation of complex mass flows often includes cascading model chains, switching from one approach or tool to the next at each process boundary (Schneider et al., 2014;Somos-Valenzuela et al., 2016). Even though such a coupling may have some advantages, Worni et al. (2014) have emphasized the need for integrated concepts, including also the process boundaries and related transitions. To this aim, twophase flow models offer an important advantage by separately considering the solid and the fluid phase, but also the strong interactions between the phases. In the present work we use the computational tool r.avaflow (Mergili et al., 2017a), which employs an enhanced version of the Pudasaini (2012) two-phase flow model. Even though r.avaflowas well as other mass flow simulation toolslargely relies on physical parameters in theory, in practice it has to be optimized to bring the simulation results in line with the observation, i.e. to reach a satisfactory degree of empirical adequacy (Oreskes et al., 1994;Fischer et al., 2015;Mergili et al., 2018). Appropriate, welldocumented and well-understood case studies are therefore of great importance to test the suitability of such models and their computational implementation for complex real-world mass flow simulations, and to derive appropriate guiding parameter values which can be used for predictive simulations. In this sense, we employ the 1962 and 1970 events at Nevado Huascarán (Cordillera Blanca, Peru) for: 1. Evaluation of numerical reproduction of spatio-temporal evolution of flow heights, velocities, travel times and volumes of complex mass flow events using r.avaflow. 2. Analysis of the validity of optimized model parameters for predictive simulations of possible future events of a magnitude different from the event for which the parameters were optimized. The availability of two reference events with comparable characteristics, but different magnitudes, is the ideal precondition for cross-wise parameter optimization and for predictive simulations with the parameter sets optimized for the other event. 3. Identifying issues of parameter sensitivity with particular emphasis on possible key threshold effects. Testing the concept of the impact indicator index (Mergili et al., 2017a) to appropriately display and evaluate parameter uncertainty in the simulation results.
Thereby we rely on the detailed description and topographic reconstruction of both events, mainly based on Hofmann and Patzelt (1983) and Evans et al. (2009). The key points of the event documentation are outlined in Section 2. Then we introduce the simulation framework r.avaflow (Section 3). We outline the data used as well as the parameterization and simulation strategy (Section 4) before presenting (Section 5) and discussing (Section 6) our results. Finally, we conclude with the key messages of the work (Section 7).
2. The 1962 and 1970 Huascarán mass flow events 2.1. The Cordillera Blanca as the origin of complex mass flows Nevado Huascarán is the highest peak of the approx. 200 km-long, NW-SE-stretching Cordillera Blanca, Peru, the most heavily glaciated mountain range throughout the tropics. Geologically it is part of a batholith characterized by tonalite and granodiorite rocks (Cobbing, 1981). The area has been repeatedly affected by earthquakes in historic times, some but not all of which have caused catastrophic landslides (Lliboutry et al., 1977). Such landslides as well as glacial lake outburst floods (e.g. Emmer et al., 2016) have repeatedly evolved into complex mass flows or even process chains travelling over tens of kilometres and, as a consequence, badly affecting the communities in the densely populated Callejón de Huaylas valley west of the Cordillera Blanca. The present article focuses on two events that originated from the steep, glacier-capped west face of the North Peak of Nevado Huascarán (6654 m a.s.l.) in 1962 and in 1970 (Fig. 1).
Due to the outstanding physical characteristics of the 1970 event in particular, and the huge impact of both events on society, the Huascarán mass flow events have been subject of extensive research over the past decades and described in detail in various sources (e.g. Ghiglino Antunez, 1971;Plafker and Ericksen, 1978;Hofmann and Patzelt, 1983). Evans et al. (2009) have summarized all the available literature and provide a comprehensive and detailed account of the characteristics of both events. Consequently, we only give a general overview, referring to Evans et al. (2009) and the references therein for further details. In particular, we revisit those characteristics of the events which are of especial importance for the computational experiments performed. Most importantly, we note that both events were complex in nature due to various flow transitions, excessive entrainment, and complex topographic conditions with overtopping of a N100 m-high ridge during the 1970 event.

The event of 10 January 1962
A rock-ice fall from the ice-capped west-facing slope of Huascarán North Peak occurred in the evening of 10 January 1962 without an apparent trigger. Due to the entrainment of snow from Glacier 511 beneath the rock face, and the uptake of primarily glacial deposits in the area beneath the glacier down to the Río Ranrahirca Valley (see Fig. 1), the movement evolved into a highly mobile debris-ice avalanche reaching the Río Santa and continuing further down as a debris flood. According to Evans et al. (2009), up to approx. 650 people were killed. Based on a combination of sources, the same authors suggest an initial volume of 1 million m 3 of rock and 2 million m 3 of ice, whereas 13 million m 3 of solid material (including ice) were deposited on the cone of Río Ranrahirca. Major transitions occurred (i) through the production of melt water from the initial ice content and the 1.8 million m 3 water equivalent of the entrained snow; and (ii) through the gain in volume by uptake of 12.3 million m 3 of glacial deposits, including water. Both transitions resulted in a drastic increase in the mobility of the landslide. Table 1 summarizes the volumes involved in the event. The velocities estimated for the event vary over a considerable range: 27 m s −1 was suggested for the terminal part of Glacier 511,   Evans et al. (2009). I = used as simulation input; X = used as reference for parameter optimization/empirical confirmation (see Table 3). The zones A-F are illustrated in Fig Avg. velocity to Río Santa (m s −1 ) 1 7 -37 e No data a Part of the ice is assumed to melt immediately after release (see Table 4). b Snow is assumed to melt immediately after it has been entrained. c Based on the assumption of 65% solid concentration (Evans et al., 2009 whereas the average velocity to Río Santa was given as 17 m s −1 and 37 m s −1 (see references in Evans et al., 2009).

The event of 31 May 1970
In the afternoon of 31 May 1970, an earthquake triggered another rock-ice fall from the west face of Huascarán North Peak (Fig. 2a, b), possibly preconditioned by mechanical destabilization through the 1962 event. The initial volume was much larger than in 1962: Evans et al. (2009) suggest 6.5 million m 3 of rock and 1 million m 3 of ice. Consequently, the energy of the event was higher and the entrainment of snow from Glacier 511 and of glacial deposits below was more intense (see Table 1). The transitions and associated increase in mobility inferred for the 1962 event also apply to the 1970 event in principle, however at a higher magnitude: 7.8 million m 3 of water equivalent were added from the surface of Glacier 511, whereas 53 million m 3 of glacial depositincluding its water contentwere entrained. In contrast to the 1962 event, the resulting debris-ice avalanche was not constrained to the valley of the Río Ranrahirca. The left part followed the Río Ranrahirca Valley, temporarily impounding the Río Santa. The subsequent debris flood reached the Pacific Ocean at a distance of 180 km. The right portion of the flow overtopped Cerro de Aira and impacted the town of Yungay (see Fig. 2a, c), leading to approx. 6000 fatalities (Evans et al., 2009; numbers vary among different sources). The material deposited in the areas of Ranrahirca and Yungay was described as fluid black slurry. Volumes involved and deposited in the various zones are summarized in Table 1. Fig. 3 illustrates the difference in the impact area between the events of 1962 and 1970. Based on various eyewitness reports, experiments, and literature analysis, the average flow velocity was deduced from travel times between the initial detachment and Yungay (Evans et al., 2009)   3. The simulation framework r.avaflow r.avaflow has been designed for simulating the propagation of various types of geomorphic mass flows. It represents a comprehensive GIS-based open source computational framework which, in contrast to most other related tools, offers a two-phase-flow model. It further considers entrainment of material along the path. These features facilitate the simulation of complex mass flows as well as of process chains and interactions. r.avaflow is described in full detail by Mergili et al. (2017a). Mainly those aspects relevant for the computational experiments conducted on the 1962 and 1970 Huascarán events are outlined in this section. The logical framework of r.avaflow is illustrated in Fig. 4. The codes, a user manual, a collection of test data and the associated start scripts are available at http://www.avaflow.org.
The Pudasaini (2012) two-phase mass flow model is used for simulating the propagation of mass flows from one or more pre-defined release areas down a mountain topography, represented by a Digital Terrain Model (DTM). Solid and fluid material can be entrained from the basal surface and incorporated into the flow. Arbitrary assemblages of solid and fluid release heights and entrainable heights may be defined. r.avaflow further relies on a set of flow parameters (see Mergili et al., 2017a for a summary of input parameters). The output of r. avaflow essentially consists in raster mapsmaximum and for all time stepsof solid and fluid flow heights H (m), velocities, pressures, kinetic energies, and entrained heights E (m). The evolution of the flow is computed through depth-averaged mass and momentum balance equations for the solid and the fluid components. The model equations and the physical concepts applied are described in detail by Pudasaini (2012). In the implementation of r.avaflow used in the present work, some model extensions are introduced. Among others, they include (i) ambient drag, accounting for air resistance (Kattel et al., 2016;Mergili et al., 2017a); and (ii) fluid friction, accounting for the influence of basal surface roughness on the fluid momentum. These extensions rely on the empirical coefficients C AD for ambient drag and C FF for fluid friction, respectively. Optional complementary functions include diffusion control, conservation of volume, surface control, entrainment, and stopping (Mergili et al., 2017a). Entrainment is computed through an empirical approach. Thereby, a user-defined entrainment coefficient C E (kg −1 ) is multiplied with the total (solid + fluid) momentum at each raster cell. Maximum entrainable solid and fluid heights can be defined as well. The solid ratio of the entrained material is always identical to the solid ratio of the basal material, assuming vertically uniform moisture patterns. Solid and fluid flow heights and momenta as well as the change of the basal topography are continuously updated. r.avaflow operates on the basis of GIS raster cells. The framework of model equations is solved through the high resolution Total Variation Diminishing Non-Oscillatory Central Differencing (TVD-NOC) Scheme introduced by Nessyahu and Tadmor (1990) and commonly applied to this type of mass flow problems (Tai et al., 2002;Wang et al., 2004).
The simulation outcomes are further evaluated against a userdefined observed impact area. Those GIS raster cells with observed mass flow impact are recorded as observed positives (OP), the ones without observed mass flow impact as observed negatives (ON). All cells with simulated mass flow impact are recorded as simulated positives (SP), those without simulated mass flow impact as simulated negatives (SN). In the present work we relate the true positive (TP), true negative (TN), false positive (FP) and false negative (FN) predictions in order to derive a set of performance indicators including critical success index (CSI), distance to perfect classification (D2PC), and factor of conservativeness (FoC) (Formetta et al., 2015;Mergili et al., 2017a; Table 2). Whereas CSI and D2PC indicate the degree of overlay between the simulated and the observed impact areas, FoC measures whether the simulation overestimates (conservative result) or underestimates the observed impact area (non-conservative result). Additional indicators are derived manually from the relation between simulated and observed volumes, travel times, or velocities. All performance indicators can be combined into the synthetic performance index (SPI) pointing out the overall empirical adequacy of a given simulation (see Table 2).  Table 2 Performance indicators used in the present study (partly based on Formetta et al., 2015;Mergili et al., 2017a). FoC, CSI, and D2PC are automatically derived within r.avaflow by overlay of the simulated and the observed impact area, whilst RVol and RT are derived manually. TP = number of true positive pixels; TN = number of true negative pixels; FP = number of false positive pixels; FN = number of false negative pixels; SP = number of simulated positive pixels; OP = number of observed positive pixels; V SIM = simulated volume; V OBS = observed volume; t SIM = simulated travel time; t OBS = observed travel time. The sum of the weights w 1 -w 5 always has to be 1.

Name Definition
Possible range Optimum Contribution to SPI Factor of conservativeness (FoC)

Computational experiments
Five computational experiments are conducted and summarized in Table 3. The availability of two reference events, different in magnitude but sharing some major characteristics, opens the opportunity to learn about the general applicability of optimized parameter sets. The experiments R1962 and R1970 aim to reproduce each of the events in an empirically adequate waybased on iterative optimization of the key parameters for each of the eventswhereas the experiments F1962 and F1970 explore the potential of the optimized parameter sets for predictive simulations, each using the parameter sets optimized for the other event. Separate parameter optimization procedures for each of the events are necessary because parameters are not only influenced by the characteristics of the surface, but also by those of the flow itself (McDougall and Hungr, 2005;Pudasaini et al., 2005;Pudasaini and Miller, 2013), and because the characteristics of the basal surface were possibly altered by the first event. The experiment S1970 elaborates on the sensitivity of the model outcomes to selected key parameters and serves for evaluating a strategy of dealing with parameter uncertainty. We only analyze the part of the mass flows starting from the release of the initial rock-ice falls down to Río Santa. Neither triggering of the events nor the distal debris floods are subjects of the present study.
We use a TanDEM-X Digital Elevation Model (DEM) from 2013 (after the 1962 and 1970 events). The DEM is available with a 12 m posting from TanDEM-X acquisitions performed along ascending and descending orbits on 10 October 2013 and 24 January 2013, combined in order to minimize issues of layover and shadowing (Wegmüller et al., 2014). The DEM is corrected for release and entrainment that has occurred during the events ( Fig. 5; see Section 2). Thereby we follow, as far as possible, the findings of Evans et al. (2009) who have compiled all available pieces of information on the 1962 and 1970 mass flows, complemented by their own investigations. The DEM is not corrected for deposition, as the spatial patterns of deposition are hardly known and deposition is not supposed to have changed the terrain patterns in a way that significantly affects the simulation results. Effects of the dam-break flood or subsequent changes in the topography are neglected as well, whereas some local artefacts clearly associated to errors in the DEM acquisition are removed. No corrections for vegetation or buildings are applied, meaning that the corrected DEM is directly used as a Digital Terrain Model (DTM). The investigation area is split into seven zones A-G. Each zone represents an area with particular geomorphological characteristics (Fig. 5 and Table 4). Also the initial conditions applied to the simulations are summarized in Table 4. They are kept constant throughout the respective computational experiments, following the information provided by Evans et al. (2009) (see   Mergili et al. (2017a). The internal friction angle is globally set to φ = 35°, whereas the solid density is adapted to the ice content of the mass flows (see Table 4).
We note that r.avaflow is currently not able (i) to adequately simulate the transition from solid to fluid material; and (ii) to consider rock and ice separately with different material densities. Therefore, entrainable snow is considered fluid from the beginning, and so is a certain fraction (2/3) of the ice. A physically better founded representation of the initial phase of the events would require an extension of r.avaflow and the underlying flow model. Such an extension could build on the rock-ice avalanche model introduced by Pudasaini and Krautblatter (2014).
The basal friction angle δ, the ambient drag coefficient C AD , and the fluid friction coefficient C FF are varied among the zones A-G. Preliminary simulations have suggested that the results in terms of the travel distance, impact area, travel time, and volumes deposited in the various zones are sensitive to the volume involved, and to the friction parameters. Consequently, parameter optimizationin terms of empirical adequacybuilds on variations in the spatial patterns of the following parameters: • Entrainment coefficient C E : C E is an empirical coefficient which is multiplied with the momentum of the flow in order to derive the entrainment rate at a given raster cell and time step (Mergili et al., 2017a). Variations in C E govern the spatio-temporal evolution of the flow. The simulated volumes are subject to equifinality (Beven, 1996;Beven and Freer, 2001) with regard to mixed effects of C E and the release volumes, which are kept constant for each of the two events. • Basal friction angle δ: variations in δ serve as a surrogate for variations in all the involved friction parameters, causing potential equifinality issues with φ, C AD , and C FF . C AD and C FF are varied among the zones A-G, but not among the computational experiments.
Parameter optimization is performed through an iterative approximation, largely starting from the parameters suggested by Mergili et al. (2017a), and then adjusting the parameters in a trial-and-error procedure until empirical adequacy is reached. All complementary functions except for stopping are enabled. The flow heights at the end of each simulation (t = 900 s) are considered as deposited heights. At this point of time the dynamics of the simulated flows is negligible. Only heights ≥0.25 m are considered for the evaluation of the simulation results.
The empirical adequacy of the outcomes of all computational experiments is quantified through the performance indicators summarized in Table 2, with the purpose of parameter optimization in the experiments R1962 and R1970, and evaluation of the results in the experiments F1962 and F1970. The observed impact areas, the volumes deposited in zones E and F, and the travel times (partly derived from average velocities inferred by Evans et al., 2009) are used for reference (see Table 1 and Fig. 5). Thereby, all relevant performance parameters receive the same weight for deriving SPI (see Table 2). Two values of RVol, and therefore also for w 4 and SPI 4 (for zones E and F) are considered, so that the number of performance parameters is six, and the weights w 1 -w 5 take values of 1/6 each.
Due to the large spatial extent of both events, a cell size of 12 m (the TanDEM-X cell size) is considered sufficient to capture the key patterns of the mass flows, and is used throughout all simulations except for the experiment 1970S, where a cell size of 24 m is applied.

Experiment R1962: back-calculation of the 1962 event
Parameter optimization is performed in order to achieve a high correspondence between the key output parameters impact area, volumes deposited in zones E and F, and travel time to Río Santa; and the reference observations (see Table 1).
The optimized parameter set (values of δ, C AD , and C FF for each of the zones A-G) is provided in Table 4. Fig. 6a shows the spatial patterns of Table 4 Zones into which the investigation area is split, and the initial conditions and empirically adequate sets of flow parameters for each zone (see No entrainment; not considered for evaluation of empirical adequacy a Optimized parameters (computational experiments R1962 and R1970); where two comma-separated values are given, the first value refers to R1962 and the second value toR1970. b 1/3 of the ice is assumed to remain in a solid state throughout the flow, whereas 2/3 of it is assumed to meltthe solid density is set to 2600 kg m −3 for the 1962 event, and to 2700 kg m −3 for the 1970 event; no entrainment is foreseen in Zone A. c Snow and ice entrained from the glacier surface is assumed to melt immediately when incorporated into the flow. This is induced by rapid fragmentation of the ice and by shear heating that occurs internally or along the basal surface (Pudasaini and Krautblatter, 2014). d Entrainmentbased on the initial conditions and C E for Zone Cis also allowed in a small area in the upper part of Zone D (see Fig. 5).
the maximum flow height and the total height of basal entrainment predicted for the 1962 event, whereas Fig. 7 summarizes the travel times and frontal velocities. The initial ice-rock avalanche rapidly gains velocity and moves over Glacier 511 at frontal velocities partly exceeding 50 m s −1 , thereby entraining fluid material. The flow enters Zone C after approx. 60 s and starts entraining the glacial deposits exposed there. Frontal velocities in this zone are in the range between 20 m s −1 and 50 m s −1 . Lateral spreading reaches beyond the observed extent. Such behaviour particularly concerns the tongues developing at the (left) southern side of the landslide, and the sharp bend before t ≈ 160 s. Here, a minor portion of the simulated flow overtops a ridge and enters the Llanganuco Valley, whereas the main landslide follows the Shacsha Valley. From this point, the simulated impact area corresponds reasonably well to the observation in its spatial extent. Most notably, at t ≈ 220 s the flow runs up the valley slopes in a way very similar to the observation. The apex of the Ranrahirca debris cone is reached after approx. 330 s, and Río Santa after t ≈ 480 s. At this stage, the flow velocity decreases from N20 m s −1 to b5 m s −1 . A total of 8.7 million m 3 of basal material is entrained, leading to a final flow volume of 11.7 million m 3 , most of which is deposited in the Zone E (Table 5). Whereas this value is lower than the uncertain value of 13 million m 3 reported by Evans et al. (2009) and given in Table 1, the simulation leads to an overestimate of the impact area. Consequently, the optimized parameter set (see Table 4) is considered a compromise between approximating the observed impact area and deposited volume. The simulated solid content of the deposit shows values around 65%. Table 5 summarizes the performance indicators associated with the experiment R1962. The synthetic performance index (SPI) is in the range 0.82-0.87 (depending on the travel time down to the Río Santa used for reference).

Experiment R1970: back-calculation of the 1970 event
As with experiment R1962, parameter optimization was also performed for the 1970 event (see Table 4 for a summary of the optimized values of δ, C AD , and C FF for each zone). The spatial distribution of the maximum flow height and the total height of basal entrainment derived with the optimized parameter set is illustrated in Fig. 6b, whereas Fig. 8 summarizes the predicted travel times and frontal velocities. As in the 1962 event, the initial rock-ice avalanche reaches maximum frontal velocities N50 m s −1 when travelling over Glacier 511. Fluid material is entrained, whereas the observed deposition in Zone B is not at all matched by the simulation results (see Table 1). Entrainment of glacial deposits occurs from t ≈ 50 s onwards. The simulated extent of lateral spreading in Zone C is in reasonable correspondence with the observation. However, the simulated lateral branches of the flow proceeding in the southward direction were not at all observed; and some patches within the observed impact area in Zone C are not impacted according to the simulation. The flow enters Zone D at t ≈ 160 s (notably, half a minute earlier than in the experiment R1962). It proceeds down the Shacsha Valley and starts overtopping Cerro de Aira at t ≈ 220 s, from where it splits into two branches (profiles B and C in Fig. 8b). The left branch continues down the valley and, from approx. 260 s onwards, spreads over the Ranrahirca debris cone down to Río Santa which it reaches after t ≈ 360 s. As prescribed by the optimization, the spatial extent of the flow, except for the distal part of the debris cone where the flow transformed to a debris flood (not considered in the present work), corresponds reasonably well to the observation. This correspondence is also found in the volume deposited in Zone E, accumulating to 45.2 million m 3 (see Tables 1, 5).
The branch having overtopped Cerro de Aira reaches the town of Yungay at t ≈ 310 s, a longer time than the 180-270 s inferred through reports and field evidence. The complex spatial patterns observed for the Yungay branch are effectively reproduced by the simulation in principle, though not in detail. Most notably, the fact that the cemetery hill was spared by the flow is also reflected in the simulation results. The predicted flow volume deposited in Zone F sums up to 3.4 million m 3 , corresponding reasonably well to the observation, considering the associated uncertainties (see Tables 1, 5). The simulated solid content of the deposit mostly shows values around 75%. The predicted flow velocities on the Ranrahirca cone and the Yungay branch largely range between 20 m s −1 and 50 m s −1 , only dropping below 20 m s −1 close to termination of the flow. The performance indicators derived from the experiment R1970 are summarized in Table 5, with SPI = 0.85-0.88 depending on the travel time to Yungay used for reference.

Experiments F1962 and F1970: predictive simulations with switched parameter sets
Applying the parameter set optimized for the 1970 event (see Table 4) to the 1962 event leads to a significantly poorer performance than achieved in the experiment R1962. SPI ranges between 0.45 and 0.51 (see Table 5). As can be seen in Fig. 9a, overtopping of Cerro de Aira is predicted, a behaviour not corresponding to the observation for the 1962 event. Analogously, applying the parameter set optimized for the 1962 event to the 1970 event leads to a simulation failing to reproduce the observed overtopping of Cerro de Aira (see Fig. 9b). As a consequence, this experiment is associated to a lower synthetic performance index than the experiment R1970 (SPI = 0.43; see Table 5 for a summary of all performance indicators). The factor of conservativeness clearly reflects the over-prediction in case of the 1962 event (FOC = 3.15) and the under-prediction in case of the 1970 event (FOC = 0.61). These outcomes are closely related to the combined effect of C Eacting as a surrogate for the increase in the flow volume - and δ D in determining the energy of a mass flow, and therefore its ability to overtop a N100 m-high ridge.

Experiment S1970: sensitivity to basal friction and entrainment coefficient
As a follow-up to the findings of the experiments 1962F and 1970F, we systematically elaborate the influence of C E and δ D on selected simulation results, considering the total deposition volumes in zones E and F (Fig. 10). Thereby we keep in mind that C E is used as a surrogate for the volume involved, and δ D serves as a surrogate for the friction parameters needed as input for r.avaflow (see Section 4). The ranges considered in terms of C E and δ D are chosen in a way to at least fully cover the parameter space between the optimized values for the 1962 and 1970 events, and even going beyond -particularly with C E . Even though the choice of the parameter space has to be considered as arbitrary in its details, widening or slightly narrowing the space would not change the key message conveyed through Fig. 10. The initial conditions of the 1970 event are applied (see Table 4). All simulations are run at a cell size of 24 m instead of 12 m in order to optimize the computational efficiency, and to explore possible effects of the spatial resolution.
We now focus on the deposition volume in zones E (V DE , debris cone of Ranrahirca) and F (V DF , Yungay branch). V DE depends on variations of both parameters in a largely linear way, without visible threshold effects (see Fig. 10a). A clear increase of the volume is observed when C E is increased, whereas the volume tends to increase slightly when δ D is increased. The increase in volume with increasing C E can be explained with the higher total volume of the flow. The increase with increasing δ D is strongly associated with the patterns shown in Fig. 10b: overtopping of Cerro de Airaand therefore V DFresponds in a highly nonlinear way to variations of both model parameters under investigation. Whereas no overtopping at all is predicted with lower values of C E and higher values of δ D , a more or less linear increase in volume is observed above a threshold line in the C E -δ D -diagram, corresponding to the "minimum requirements" for overtopping. One consequence of the response of V DF to δ Dwhich is stronger than the response to C E within the considered parameter spaceconsists in the slight depletion of V DE with decreasing δ D . Fig. 10c illustrates the spatial patterns of the impact indicator index (III) derived from 25 simulations with the same combinations of C E and δ D as shown in Fig. 10a and b. III is the fraction of simulations predicting an impact (in the present study, defined as H ≥ 0.25 m) at a given computational cell (Mergili et al., 2017a). In the upper part of the flow down to Zone D, III does not react in a way that is sensitive to variations in the parameters under investigation. This is not surprising as δ is only varied in Zone D. Also the effects of an increased value of C E are relatively

Table 5
Key simulation results and performance parameters for the first four computational experiments. V DE = volume deposited in Zone E, V DF = Volume deposited in Zone F, t Y = travel time to Yungay, t S = travel time to Río Santa. Refer to Table 1 for the reference  values; to Table 2 for the description of the performance indicators FoC, CSI, D2PC, RVol (the subscripts E and F refer to the considered zone), RT, and SPI; and to Table 3  moderate in zone C. In zones D-F the spatial dimension of the effects illustrated in Fig. 10a and b becomes evident: (i) decreasing values of III in the lateral portions of the flow in zones D and E; and (ii) a sharp decrease of III at Cerro de Aira, reflecting the threshold effect of overtopping. Consequently, the values of III are much lower in Zone F than in Zone E. With none of the parameter combinations explored in experiment 1970S, the southward "outbursts" in Zone C, which are very evident in the experiment 1970R, are observed (see Figs. 8,10c). Instead, the southernmost portion of the observed impact area in Zone C is under-predicted by the simulation. This is possibly an effect of cell size (12 m in the experiment 1970R vs. 24 m in the experiment 1970S).

Discussion
Multi-hazard phenomena and situations recently have attracted increased scientific interest, both at a conceptual level (Kappes et al., 2012), andparticularly for complex mass flows and process chainsfor event back-calculations using computer simulations (Schneider et al., 2014;Somos-Valenzuela et al., 2016). Recent studies have identified and highlighted challenges related to limited physical understanding, limited capabilities of current simulation approaches at process boundaries, and uncertain surface topography and material parameters (Westoby et al., 2014;Schaub et al., 2016;Mergili et al., 2018). Worni et al. (2014) underline the need for combining the individual components of complex mass flows and process chains in one integrated model, instead of coupling different simulation approaches. Conventional mass flow simulation tools, even though representing valuable instruments to investigate single phase debris flows, rock avalanches, or snow avalanches, are not capable of capturing the flow dynamics resulting from multi-phase phenomena (phase separation etc.), changing solid and fluid fractions and their interactions.
The present study has clearly demonstrated the ability of r.avaflow to back-calculate complex mass flows in an empirically largely adequate way. In contrast to earlier approaches, r.avaflow supports a continuous simulation across the boundaries of the individual components of these complex events. While this is considered a major step forward, there is still potential to extend the physical basis of the simulation framework. This can be done by taking into account further flow models such as the rock-ice avalanche model of Pudasaini and Krautblatter (2014): this approach considers internal mass and momentum exchanges between the phases, lubrication, and fluidization. The associated internal and basal strength weakening provide a more realistic simulation, especially during the critical initial and propagation stages, and explain the exceptionally high and dynamically changing mobility of rock-ice avalanches. We have also identified some important limitations concerning the predictive application of the computational framework for purposes of hazard mapping and risk management. Even though the Huascarán mass flows are generally well documented (particularly the event of 1970), the observations used as reference are uncertain (Evans et al., 2009). Therefore, validation of the model outcomes in a strict sense is not possible. Instead, we rather evaluate their empirical adequacy in the sense of Oreskes et al. (1994).
The optimized parameter values are not necessarily physically meaningful, but should be considered as heuristically derived values leading to empirical adequacy (de Lima Neves Seefelder et al., 2017;Mergili et al., 2018). In theory, almost all of the parameters can be determined through experiments in the field or in the laboratoryin practice this is only feasible for many types of simple materials and conditions. This makes the application of optimization procedures necessary. This fact is illustrated by the low empirically adequate values of δ, compared to the angle of reach of the events of approx. 13-14°(see Table 4): δ applies only to the solid fraction and is governed by a complex interplay of the flow and basal material as well as by the dynamics of the flow, which are impossible to measure for events of this magnitude. Further, the effects of variations in δ on the simulation results mix with the effects of other parameters: results are susceptible to equifinality in the sense of Beven (1996) and Beven and Freer (2001), particularly when considering only one or few reference parameters while optimizing multiple model parameters. In the present work, the model parameters are optimized in terms of empirical adequacy against various reference parameters, possibly reducing equifinality issues. Still, due to the large number of physically required input parameters of the two-phase mass flow model used in r.avaflow, only two were selected for sensitivity analysis. Thereby, C E serves as a surrogate for the entrained volume, whereas the effects of δ may be in part similar to those of C AD and C FF . The zonal variation of these three parameters (see Table 4) increases the complexity of the issue. Even though, in theory, a systematic multi-parameter sensitivity and optimization campaign would be preferable (Saltelli and Annoni, 2010) and potentially useful approaches to perform such an analysis (e.g. Fischer et al., 2015) can be coupled to r.avaflow, such an analysis requires specific strategies (also for reducing the computational time) and goes beyond the scope of the present work. Further extending the tool developed by Fischer et al. (2015) could enable systematic multi-parameter studies in the future, where also equifinality issues can be covered in a more quantitative way.
The empirically most adequate results do not necessarily coincide with a plausible spatial differentiation of the basal friction angle: on the one hand, such a spatial differentiation introduces equifinality also among the friction angles of the zones A-G, in addition to the equifinality issues among the various parameters. On the other hand, failure to reproduce the observation with a plausible parameterization may reflect a partly insufficient representation of the process by the simulation. It was necessary to use a substantially lower value of the basal friction in zone D to correctly simulate overtopping of Cerro de Airaa fact that may indicate that the complex flow behaviour, transitioning from an open slope into a narrow gully, and then partly overtopping a steep ridge, cannot be reproduced in full detail by the present simulations. These effects, which appear due to more or less abrupt changes in the topography, are also related to the spatial resolution (computational or DTM), including e.g. curvature related accelerations (Fischer et al., 2012). According to eyewitness reports, part of the material was actually not flowing but rather dispersed in the air, a behaviour currently not included in r.avaflow. The generally decreasing ambient drag coefficient represents an assumed tendency of the movement to gradually transform from falling and jumping to flowing, and thereby decreasing air resistance.
Having said that, the question of applicability of optimized parameter sets for predictive simulations arises. One could postulate that predictions of future mass flows can be conducted on the basis of back-calculated parameters of an earlier event in the same area, and of scenarios of the release volume and the type of material. However, the findings of the experiments F1962, F1970, and S1970 do not support such an approach. The lower value of δ in Zone D and the higher value of C E leading to the empirically most adequate results for the 1970 event, when compared to the parameter values for the 1962 event (see Table 4) may to some extent be explained by the fact that (i) the 1962 event might have led to some smoothing of the terrain and therefore decreased basal friction; and that (ii) the properties of much of the surficial material might have changed during the 1962 event. However, some other effects are thought to be more important: • The basal friction angle is not only governed by the characteristics of the surface, but also by those of the flow itself, including possible grain sorting and phase-separation (de Haas et al., 2015;Pudasaini and Fischer, 2016a), and so is C FF . The larger and more energy-rich 1970 flow may have evolved in a physically different way than the 1962 flow. • The approach used in r.avaflow to multiply C E with the momentum (Mergili et al., 2017a) might insufficiently capture the physical processes: otherwise the entrained volume would scale correctly with flow magnitude. The physically-based entrainment model of Pudasaini and Fischer (2016b) directly correlating the basal friction to the entrainment coefficient may produce better results. However, implementation of this advanced model is not the focus here as it requires more detailed scrutiny and investigation.
The effects described may influence the simulation results either in a linear way, with smooth and gradual responses of the simulation results to parameter variationsmainly in the case of "simple" processes such as the runout of a mass flow on a debris cone (see Fig. 10a)but they may also lead to critical threshold effects (see Fig. 10b). The issue of threshold effects in mass flow simulation was recently highlighted by Mergili et al. (2017bMergili et al. ( , 2018, employing generic and real-world cases of landslide impact into a lake as examples. In the context of the present study, the most obvious threshold effect is overtopping of Cerro de Airaobserved in 1970 but not in 1962. The computational experiment S1970 has clearly illustrated the high sensitivity of overtopping on two uncertain model input parameters. In terms of hazard mapping, simulation results based on a mis-estimate of the basal friction angle by only two degrees might put thousands of people at risk when used as the basis for spatial planning.
In summary, it has to be stated that computational experiments such as those conducted in this work help us to better understand the model, its application to real-world cases, and its behaviour with various parameterization scenarios. To a certain degree, the uncertainty in parameterization contains an epistemic component that can be reduced through the derivation of guiding parameter values for certain flow types and magnitudes. However, a large aleatoric component of parameter uncertainty will most probably remain, so that parameter sets back-calculated from past events are not necessarily the key to the future. Predictive simulations of complex mass flows require appropriate strategies in order to adequately deal with this issue. On the one hand, multiple scenarios can be considered (Schaub et al., 2016). On the other hand, ranges of initial conditions and flow parameters can be analyzed instead of fixed values. The result of this type of approach are likelihood measures such as the impact indicator index introduced by Mergili et al. (2015Mergili et al. ( , 2017a and applied in the computational experiment S1970. An important, yet challenging step consists in the identification of suitable parameter spaces that can be applied in such contexts (Krenn et al., 2016). Indices or probabilities implicitly account for the uncertainties in the results. Making the information provided less definite increases the chance of successful predictive simulations. Communicating uncertain simulation results to authorities represents a challenge on its own, with experiences available for example from flood risk management (Ramos et al., 2010;Demeritt et al., 2013;Pappenberger et al., 2013). More appropriate communication strategies would be essential to establish and maintain well-targeted management strategies of the risks associated to complex mass flows in the Cordillera Blanca and elsewhere in the world (Carey, 2005;Carey et al., 2012Carey et al., , 2014.

Conclusions
Representing well-documented events with similar release areas and characteristics, but varying in their magnitude and level of complexity, the 1962 and 1970 mass flows at Huascarán (Cordillera Blanca, Peru) may assist mass flow modelers to increase the understanding of key challenges they are facing in their research work. We have deduced the following key statements from a set of computational experiments on these two events with the GIS-based open source twophase mass flow simulation framework r.avaflow: 1. The key observations for each of the two events can be computationally reproduced at a reasonable degree of accuracy with tailor-made (optimized) sets of parameter values.
2. Running the simulations for the two events with switched parameter sets (set optimized for the 1962 event applied to the 1970 event and vice versa) leads to unsatisfactory results in both cases. Most notably, overtopping of Cerro de Aira, representing an important threshold effect, is not correctly predicted. 3. Systematically analyzing the effects of the basal friction angle in a specific zone and the entrainment coefficient reveals a high sensitivity of the key model outcomes to even small variations in the combination of the parameter values. A difference of thehighly uncertainvalue of the basal friction angle by just a few degrees may decide whether Cerro de Aira is overtopped or not, with all the consequences thereof.
Whereas these findings may help to improve some concepts employed in r.avaflow, they also underline some major general issues to be considered for predictions of mass flows conducted with the purpose to inform risk management: enhanced back-calculation efforts and parameter studies with a larger number of well-documented events of various types and magnitudes are required in order to allow for the anticipation of the key characteristics of future mass flow events in terms of likelihood measures or scenario analyses.