Earth and Planetary Science Letters Upper mantle temperature and the onset of extension and break-up in Afar, Africa

It is debated to what extent mantle plumes play a role in continental rifting and eventual break-up. Afar lies at the northern end of the largest and most active present-day continental rift, where the East African Rift forms a triple junction with the Red Sea and Gulf of Aden rifts. It has a history of plume activity yet recent studies have reached conﬂicting conclusions on whether a plume still contributes to current Afar tectonics. A geochemical study concluded that Afar is a mature hot rift with 80 km thick lithosphere, while seismic data have been interpreted to reﬂect the structure of a young, oceanic rift basin above mantle of normal temperature. We develop a self-consistent forward model of mantle ﬂow that incorporates melt generation and retention to test whether predictions of melt chemistry, melt volume and lithosphere–asthenosphere seismic structure can be reconciled with observations. The rare- earth element composition of maﬁc samples at the Erta Ale, Dabbahu and Asal magmatic segments can be used as both a thermometer and chronometer of the rifting process. Low seismic velocities require a lithosphere thinned to 50 km or less. A strong positive impedance contrast at 50 to 70 km below the rift seems linked to the melt zone, but is not reproduced by isotropic seismic velocity alone. Combined, the simplest interpretation is that mantle temperature below Afar is still elevated at 1450 ◦ C, rifting started around 22–23 Ma, and the lithosphere has thinned from 100 to 50 km to allow signiﬁcant decompressional melting.

It is debated to what extent mantle plumes play a role in continental rifting and eventual break-up. Afar lies at the northern end of the largest and most active present-day continental rift, where the East African Rift forms a triple junction with the Red Sea and Gulf of Aden rifts. It has a history of plume activity yet recent studies have reached conflicting conclusions on whether a plume still contributes to current Afar tectonics. A geochemical study concluded that Afar is a mature hot rift with 80 km thick lithosphere, while seismic data have been interpreted to reflect the structure of a young, oceanic rift basin above mantle of normal temperature. We develop a self-consistent forward model of mantle flow that incorporates melt generation and retention to test whether predictions of melt chemistry, melt volume and lithosphere-asthenosphere seismic structure can be reconciled with observations. The rareearth element composition of mafic samples at the Erta Ale, Dabbahu and Asal magmatic segments can be used as both a thermometer and chronometer of the rifting process. Low seismic velocities require a lithosphere thinned to 50 km or less. A strong positive impedance contrast at 50 to 70 km below the rift seems linked to the melt zone, but is not reproduced by isotropic seismic velocity alone. Combined, the simplest interpretation is that mantle temperature below Afar is still elevated at 1450 • C, rifting started around 22-23 Ma, and the lithosphere has thinned from 100 to 50 km to allow significant decompressional melting.

Introduction
The Afar region in northern Ethiopia forms the northern tip of the largest and most active present-day continental rift, where the Main Ethiopian Rift (MER) intersects the Red Sea Rift and the Gulf of Aden (Fig. 1). Beneath the Red Sea Rift, crust is thinned to about 15 km (Markis and Ginzburg, 1987;Hammond et al., 2011), volcanism is much more wide spread than in the MER, and much of the region has subsided below sea level in the past few million years, observations that point to the region being in transition from rifting to spreading (Markis and Ginzburg, 1987;Hayward and Ebinger, 1996;Bastow and Keir, 2011). 1350 • C. Seismic travel-time tomography  finds low seismic velocities under Afar consistent with such a shallow melt zone. At the other end of the debate, Ferguson et al. (2013) used the trace element composition of mafic lavas from Afar and petrogenetic modelling to argue that the erupted magmas are predominantly generated below a still thick lithosphere, at depths greater than 80 km, and at mantle potential temperatures close to 1450 • C. This temperature is more consistent with major element composition of lavas within the northern part of the East African Rift system, which suggest an elevated potential temperature of between 1370 and 1490 • C for rocks erupted in the last 10 Ma (Rooney et al., 2012b).
A central question is therefore whether there is still currently active rifting and the up-welling of deep and hot (>1400 • C) mantle material, which is possibly rooted in the African Superplume (e.g. Nyblade, 2011). The other possibility is that today Afar has evolved to the point of steady passive up-welling of the asthenosphere. Although estimates of the mantle potential temperature beneath Afar for these respective scenarios both lie between 1350 and 1490 • C, this range represents the difference between either minimal volcanism or the generation of a significant amount of melt (White and McKenzie, 1995;Armitage et al., 2010). Knowledge of how mantle temperature and lithospheric thinning have evolved in Afar is therefore essential to understanding which key tectonic and/or magmatic processes are driving the continued development of this rift system.
In this study we attempt to reconcile the seemingly contradictory geochemical and geophysical observations with a single tectonic scenario. We develop a set of models of lithospheric extension and mantle melting and compare the model predictions to observations of melt volume, melt chemistry, bulk seismic velocities and discontinuities of the lithosphere-asthenosphere. We focus on three regions of Afar where both seismic and geochemical constraints are available: (1) the Erta Ale rift zone, within the northern Danakil Depression, which is near the northwestern end of the main active rift, (2) the Dabbahu magmatic segment in central Afar, near the border of the rift zone, and (3) the Asal rift zone at the southeastern end of the Afar rift zone and the western edge of the Gulf of Aden (Fig. 1).

Methods
We use a relatively straightforward 2-D geodynamic model of extension of a viscous lithosphere-asthenosphere system and decompressional melting to explore the effect of mantle temperature on rift evolution. Modelling extension within the region of the Danakil Depression and Asal Rift as a 2-D process is reasonable given that extension is perpendicular to the rift axis (Fig. 1).

Geodynamic model
Evolution of upper mantle temperature and flow is modelled as Stokes flow driven by a divergent upper velocity boundary condition and a temperature difference between the surface and model base (Armitage et al., 2010). The most likely rheology for continental lithosphere and sublithospheric mantle is a temperature and pressure dependent dislocation creep (e.g. Lévy and Jaupart, 2011). We use a formulation that includes the effect of meltweakening and dehydration strengthening (see Supplementary Material; Armitage et al., 2010). Thermal state and non-Newtonian viscous flow are solved in the finite-element code CitCom (Moresi et al., 1996). We use a 2800 km wide by 700 km deep 2-D Cartesian domain containing 512 by 512 equally spaced nodes, providing a resolution of 5.6 by 1.4 km.
Solid-mantle density in the models changes as a function of temperature, melt retention, and melt depletion, providing buoy-ant flow due to melt generation. Melt production is calculated as a function of temperature, pressure and previous depletion (Scott, 1992;Phipps Morgan, 2001;Nielsen and Hopper, 2004;see Supplementary Material). Melt starts once the hydrous solidus is crossed, where up to 2% melt can be generated until temperature exceeds the dry solidus. We include a hydrous solidus as there is evidence of some water within the volcanics in Afar (e.g. Pik et al., 1999). Melt generated is assumed to be transported vertically by Darcy flow, using the methods described in Goes et al. (2012), for more explanation see the Supplementary Material.
Solid mantle-flow boundary conditions are a prescribed symmetric divergent-flow velocity condition on the top and free slip on the sides and base. Temperature boundary conditions are fixed temperature at the base (a mantle potential temperature of 1350, 1450 or 1550 • C) and the top (0 • C), and a zero normal temperature gradient at the sides. The initial condition is a 100, 150 or 200 km thick lithosphere defined by a linear increase in melt depletion (from 0% at the base to 50% at the surface) and reduction in temperature from the basal temperature condition at 100 km depth to 0 • C at the surface. The high buoyancy is to keep the high viscosity lithosphere at the top of the model domain. This material, due to its buoyancy, does not participate in the melting as it remains above the solidus and is moved to the sides due to the divergent boundary condition.

Melt chemistry
Partitioning of REEs between the solid mantle and partial melt is calculated from the melt depletion, temperature and pressure within the melt region assuming incremental batch melting (see Dean et al., 2008;Gibson and Geist, 2010;Armitage et al., 2011;and the Supplementary Material). Geochemical and isotopic evidence from erupted melts shows that mantle source beneath Afar is fertile compared to the depleted upper mantle source of MORBs (e.g. Schilling, 1973;Schilling et al., 1992;Barrat et al., 1998;Rooney et al., 2012a;Ferguson et al., 2013), and that this fertile source has been a long-lived feature of the mantle here (Pik et al., 1999). To examine how the extent and depths of melting evolve during rift development we calculate the REE concentration of melts generated by partial melting of mantle upwelling beneath the extending lithosphere. For the source we use a fertile mantle composition from McDonough and Sun (1995).

Conversion to seismic velocities
The models are converted to synthetic seismic structure (compressional velocity, V P , shear velocity, V S , density, and shear attenuation, Q S ) following the methods described in Goes et al. (2012). A thermodynamic formulation is used to determine elastic parameters and density as a function of temperature, pressure, composition, and phase, using the mineral parameter compilation of Xu et al. (2008), and the code PerPleX from Connolly (2005). For the fertile mantle we use a peridotite composition, and for the depleted mantle a harzburgite, both from Xu et al. (2008). Seismic velocities are not sensitive to more detailed variations in composition, so we linearly grade between these compositions as a function of degree of depletion (see Goes et al., 2012). Subsequently, we correct for shear attenuation using a semi-empirical temperature, pressure and dehydration dependent Q model, Q g in Goes et al. (2012). We assume that the reference mantle is damp, as estimated for an MORB-source (1000 H/10 6 Si; Hirth and Kohlstedt, 1996).
The presence of melt is in most models assumed to only affect the elastic response (Hammond and Humphreys, 2000a;Gribb and Cooper, 2000). We chose the melt derivatives for cuspate melt geometries from Hammond and Humphreys (2000b): a 7.9% reduction in V S per percent melt and a 3.6% reduction in V P per percent melt. These are the highest derivative estimates for melt fractions below 1%, commensurate with the low melt fractions predicted in most of the model. However, it should be noted that alternative melt geometries with a preferential orientation sufficient to cause significant seismic anisotropy may result in even larger velocity reductions (Blackman and Kendall, 1997). Some experimental studies indicate that the presence of melt might also effect attenuation within the seismic frequency range (Faul et al., 2004;McCarthy and Takei, 2011). To investigate the maximum effect that melt could have, we include in one of the models an additional effect on attenuation. We parameterised the effect simply by decreasing shear attenuation by half an order of magnitude (i.e., factor ∼3) per percent melt. This is on the lower end of what Faul et al. (2004) measured for wave periods between 1 and 100 s but stronger decreases give unrealistically low S-wave velocities (as low as 3.2 km s −1 ).

Synthetic S-to-P receiver functions
We calculated synthetic S-wave receiver functions for predicted 1-D velocity models beneath the rift, after adding a crustal velocity structure consistent with that determined from co-located P -wave receiver functions (thickness of between 20 and 30 km, V P = 6.2 kms −1 and V S = 3.3 kms −1 ; Hammond et al., 2011).
Synthetic seismograms were calculated using a propagator matrix method (Keith and Crampin, 1977). We assume a representative horizontal slowness of 0.106 s km −1 and dominant period of 11 s. The three component synthetic seismograms were rotated into theoretical P and S V components using a free-surface transformation matrix (Bostock, 1998;Cerveny, 2005). The S V waveforms were then deconvolved from P waveforms using a simultaneous deconvolution (Bostock, 1998;Rychert et al., 2007). Filtering (0.05 to 0.14 Hz) and water level deconvolution (amplitude of 2 × 10 5 ) were applied to the waveform to match parameters used in the modelling of data from the Afar triple junction (Rychert et al., 2012).

Model cases
To reconcile the different observations we test a range of mantle temperatures, different durations of rifting, a range of initial lithospheric thicknesses, and several melt productivity and retention parameters. The range of models that will be discussed is given in Table 1.
In most of the models we consider a two-phase extension history. Extension in the Afar region was initially slow until around 11 ± 2 Ma when extension shifted 100 km westwards and the southern Red Sea became a failed extensional basin. At this time, the rate of extension doubled to the current half spreading rates (Reilinger and McClusky, 2011). The onset of rifting in this region followed the major phase of volcanism that formed the Ethiopian flood basalts at ∼30 Ma (Hofmann et al., 1997) and is estimated to have begun between 21 and 29 Ma (Wolfenden et al., 2005;Reilinger and McClusky, 2011). Extension is imposed on the models as a divergent upper velocity boundary condition on either side of the rift axis. The half-rate is initially set to 3.5 mm yr −1 , and increases to 7.0 mm yr −1 after 11 Myr. The forward model is run for a duration of up to 35 Myr.
The thickness of the continental lithosphere prior to the formation of the East African Rift zone is not known. Phanerozoic continents are estimated to be between 80 and 120 km thick from surface heat flow (Artemieva and Mooney, 2001) and inferred from surface wave tomography for Africa away from the cratons and active rifts (Fishwick and Bastow, 2011). To explore the impact of the assumed initial lithosphere thickness, we first model simple extension at a half spreading rate of 7 mm yr −1 of a lithosphere that has a thermal and compositional thickness of 100, 150 and 200 km above a mantle with a potential temperature of 1450 • C (models L100, L150 and L200). For other models we use an ini-tial lithospheric thickness of 100 km which falls within the range expected for the region.
While melt volume and chemistry are sensitive to melt production history, seismic structure is sensitive to the amount of retained melt. As our reference value, we assume melt permeability 10 −10 m 2 . This gives a migration velocity of ∼cm yr −1 (Goes et al., 2012), which is relatively slow compared to U-series constraints on melt-extraction rates (e.g. Stracke et al., 2006). However, as some seismic constraints (Rychert et al., 2012;Stork et al., 2013) have been interpreted with relatively large fractions of retained mantle melt, we also tested a model with higher melt productivity, a model with an order of magnitude lower permeability (model 1450L), and a model where melt only becomes mobile once it exceeds a threshold of 1% as suggested for mantle melts by Faul (2001)

Results
The constraints that we have on Afar rifting are (1) an estimate of total melt volumes, (2) melt composition, (3) volumetric seismic velocities, (4) depths and contrasts of seismic discontinuities. Melt volumes and chemistry are sensitive to mantle temperature and the evolution of lithospheric thickness, i.e., initial lithospheric thickness and the history of rifting, and consequent evolution of mantle depletion. Volumetric seismic velocities give constraints on the thickness of the lithosphere and depth of the asthenosphere, while discontinuities may correspond to compositional or phase boundaries including the onset of dry melting, and the depth of dehydration.

Constraints on igneous addition to the crust
The thickness of the crust that may be due to magma emplacement during rifting can be estimated from receiver functions and active seismic transects. Thickness of the original unstretched crust inferred from the rift's western margins and in Yemen/Arabia is between 19 and 21 km for the upper and 18 to 23 km for the lower crust (Mechie et al., 1986;Markis and Ginzburg, 1987;Maguire et al., 2006). At Erta Ale, we find an upper crustal thickness of 2 and 1 km and a lower crust of 11 and 9 km from a near-by seismic station and wide-angle seismic survey (Markis and Ginzburg, 1987;Hammond et al., 2011). If we assume that the upper crust contains no intrusions of magmatic material, the stretch factor is between 9.5 and 21, suggesting that the lower crust is intruded with between 6 and 10 km of material. Upper and lower crustal thicknesses measured near Dabbahu are 4 km and 16 km, respectively (Hammond et al., 2011), i.e., a stretch factor between 4.8 and 5.3, implying an addition of 11 to 15 km of volcanic material. At Asal, thicknesses of 2 to 6 km for the upper and 14 to 15 km for the lower crust (Ruegg, 1975) imply a stretch factor of 3.2 to 10.5 and the addition of 9 to 14 km of volcanic material. This yields total volumes of melt added during rifting that correspond to 6-14 km of crustal thickness. Fig. 2 illustrates the influence of initial lithospheric thickness on melt production, for a spreading history at single constant spreading rate (models L100, L150, L200 in Table 1). The trend in mean melt fraction and igneous thickness shows an initial slow increase in melt production (Fig. 2), while melting only occurs deep and at temperatures below the dry solidus, where melt productivity is low. Once the lithosphere has sufficiently thinned to allow dry decompressional melting, productivity increases, causing the rapid rise in melt fraction, after 5, 15 and 23 Myr for 100, 150 and

Models: Effect of lithospheric thickness and mantle temperature
is the mantle density, u x = 7 mmyr −1 is the half spreading rate, ρ l = 2800 kg m −3 is the melt density and φ melt is the melt production rate.
(B) Mean melt fraction given by F mean = Fφ melt dxdz/ F dxdz, where F is the local melt fraction (Plank et al., 1995). Dashed lines are for hydrous-only melt production. The spikes in the trends for model L150 are due to convective instabilities nucleated at the edge of the thinned lithosphere. These instabilities increase the melt production rate φ melt , hence temporarily increasing crustal thickness. The mean melt fraction reduces at the same time as the zone of high melt production is at depth within regions of low melt fraction therefore biasing the mean melt fraction calculation.
200 km thick lithosphere, respectively. A 50 km increase in lithosphere thickness delays this transition by roughly 10 Myr. The delay time depends on spreading rate; for a half spreading rate of 20 mm yr −1 this delay reduces to 3 Myr and at 80 mm yr −1 it is less than 1 Myr (Armitage et al., 2009). For both mean melt fraction and igneous thickness, the only effect of an increased lithosphere thickness is a systematic delay in the increase in productivity (Fig. 2). Therefore, the evolution of melt composition is similarly shifted in time, but not otherwise affected by the initial lithosphere thickness.
Next we explore how asthenospheric temperature alters the melt productivity. We do this for the more realistic two-phase rifting scenario, to allow for a comparison of melt volumes with the estimated thickness of igneous crust and observed range of durations of rifting. We show results for the models with an initial lithosphere thickness of 100 km, i.e., assuming the present day surrounding lithosphere structure is representative of conditions prior to break-up.
In model 1350N, where the asthenospheric potential temperature is 1350 • C, up to 5 km of igneous crust is generated after 35 Myr of evolution (Fig. 3). The mean melt fraction is typically low, ranging from 0.01 at the onset of dry melting to 0.03. The maximum melt fraction reaches 0.08 after 35 Myr of model evolution. For model 1450N, where the potential temperature is 100 • C hotter, melt production is significantly increased and ig-  Table 1). The green-gray region shows the range of igneous thickness calculated from wide angle and receiver function studies (see Supplementary material). (B) Mean melt fraction and (C) maximum melt fraction for the three model cases.
(D) Lithosphere thickness taken as the depth to the 1250 • C isotherm. neous crustal thickness reaches 8 km after 21 Myr and 15 km after 35 Myr (Fig. 3). For the 1450N case, the mean melt fraction increases to 0.07 and the maximum melt fraction is 0.19. In model 1550N, where the mantle potential temperature is 1550 • C, melt production rates are very high and the igneous crustal thickness is greater than 15 km after 13 Myr and greater than 30 km after 18 Myr of model evolution (Fig. 3). Furthermore, mean melt fractions are high, up to 0.18, and the maximum melt fraction rapidly increases beyond 0.3. For all models, the depth to the base of the lithosphere, defined as the depth to the 1250 • C isotherm, is similar (Fig. 3D). This is because the thinning of the lithosphere is primarily controlled by the extension, which is at the same rate for all of the models.

Comparison with igneous crustal thickness
In comparison to the observed igneous crustal thickness of between 6 and 14 km, a mantle potential temperature of 1550 • C is too productive and hence too hot to match crustal thickness given the age of the rift zone (Fig. 3). For the case of hydrous melting, a mantle potential temperature of 1350 • C is on the cold side, as it can only approach the lowest estimate for the thickness of intruded melt after the maximum time of extension (Fig. 3). The 1450 • C model predicts 6-12 km of igneous crust after 20 to 29 Ma of rifting, which is within the range of rift durations estimated from observations. For the case of only anhydrous melting a mantle potential temperature of 1350 • C creates less than 6 km, while at 1450 • C the model predicts a thickness >6 km after 18 Myr (see There is a trade off between the temperature of the asthenosphere and the assumed initial lithosphere thickness. For temperatures lower than 1350 • C, an initial lithosphere thickness less than 100 km could produce a reasonable age-igneous thickness trend. Likewise if temperatures are higher than 1450 • C, a lithosphere that is thicker than 100 km would allow matching rift duration and igneous crustal thickness. However, as we will show in the comparison with seismic velocities in Section 3.3, an initial thickness of around 100 km, similar to other Phanerozoic lithosphere (e.g. Artemieva and Mooney, 2001), is most consistent with the range of observations.

Melt composition
We will now explore how the two preferred models from the previous section, models 1350N and 1450N, create different temporal evolutions in melt chemistry that allows us to better constrain rift duration.

Observations: REE trends
We compare predicted melt compositions to the REE chemistry of late-Pleistocene samples from the Erta Ale (Barrat et al., 1998), Dabbahu (Ferguson et al., 2013) and Asal (Pinzuti et al., 2013;Schilling et al., 1992) magmatic rift zones. These were corrected for crystal fractionation by the incremental addition of olivine, until melts were obtained that are in equilibrium with mantle olivine compositions of Fo 90 . This was done using an Fe 3+ / Fe ratio of 0.16 (Ferguson et al., 2013) and an Fe-Mg partition coefficient (Kd = (Fe 2+ /Mg) olv /(Fe 2+ /Mg) melt ) of 0.3. For each zone, the data display a clear decrease in source normalised concentration from middle to heavy REE. The ratio of La to Yb for the rock samples from all three zones have a mean of 5.52 and a standard deviation of 2.26. For Dy/Yb the ratio is 2.02 ± 0.15 (Fig. 4).

Models: Effect of mantle temperature and rift duration
During early extension low degree melts are generated during hydrous melting at high pressures, where garnet is stable in the mantle. During this early period, the most incompatible elements, Fig. 4. Model trend in the ratio of La to Yb and Dy to Yb against time as the rift evolves with extension at a half rate of 7 mmyr −1 after an initial period of 11 Myr of extension at a half spreading rate of 3.5 mm yr −1 . The model is compared to the observed compositions from the Erta Ale (Barrat et al., 1998), Dabbahu (Ferguson et al., 2013) and Asal magmatic segments (Pinzuti et al., 2013;Schilling et al., 1992). Dashed lines are for hydrous-only melt production. (A) Ratio of Dy to Yb plotted against time. The ratio decreases as more productive dry melting begins, and as the degree of melting increases above the garnet-in transition. such as La and Ce, enter the melt leaving a depleted source. Compatible elements, such as Dy and Yb, predominately remain within the solid mantle matrix during early melt production.
For a fertile mantle with no previous depletion, and a mantle potential temperature of 1350 • C, the wet solidus is crossed at a depth of 120 km and the dry solidus at 55 km; for a mantle potential temperature of 1450 • C, the wet solidus is crossed at 150 km and the dry solidus is at 80 km depth (see Eqs. (4) to (6) in the Supplementary Material). The spinel-out boundary is between 80 and 85 km at 1350 and 1450 • C ( McKenzie and O'Nions, 1991). With these parameters the La/Yb ratio decreases earlier than the Dy/Yb ratio, due to the incompatible elements having partition coefficients that are two orders of magnitude smaller in the garnet stability field (Fig. 4). From the plot of Dy/Yb against time we can see that, if the mantle temperature is 1450 • C, productive dry melting occurs while the melt composition is still dominated by the garnet signature. If however the mantle temperature is 1350 • C there is a second decrease in Dy/Yb at 26 Myr, as more productive dry melting commences (Fig. 4A). If we assume the mantle is dry (models 1350D and 1450D in Table 1) then the REE composition is only matched by the colder model when there is a very small volume of melt (see Supplementary Material).

Comparison with REE trends
In Fig. 4 the evolution of the La/Yb and Dy/Yb ratios approaches the observations after dry melting has initiated and the lithosphere has thinned to less than 50 km (Fig. 3). For a mantle temperature of 1350 • C the model 1350N approaches the observed composition after 29 Myr of model evolution with a mean melt fraction of 0.05 (Fig. 4). For an asthenosphere of 1450 • C the model 1450N approaches the observed composition after 22 Myr of model evolution with a mean melt fraction of 0.08 (Fig. 4).
To further explore which mantle potential temperature best fits the REE signature of the recent Afar lavas, we use a simple measure of maximum likelihood (Wald, 1943), and compare this fit to the χ 2 value for each sample size at the 95 and 70% confidence interval (Fig. 5). For each zone, the data display a clear decrease in source normalised concentration from middle to heavy REE. For our two model scenarios, as the rifting progresses the modelled melt compositions approach those of the observed lavas and then diverge towards more depleted compositions. For all three data sets, the model with a mantle potential temperature of 1450 • C gives the best fit at a time of 22-23 Myr after the initiation of extension, while the cooler model requires a longer duration of extension, 30-35 Myr (Fig. 5).
In detail, for the Erta Ale lavas, the model with a mantle potential temperature of 1450 • C fits well, while the cooler model does not as convincingly re-create REE compositions ( Fig. 5A and D). The hotter model can match the Dabbahu magmatic segment data, however, the cooler model fits better (Fig. 5B and E). The southernmost site, the Asal Rift, which is the first on-land section of the Gulf of Aden, has an REE profile that is best matched by the melting of a 1450 • C mantle ( Fig. 5C and F). The cold model does also approach a close fit to the composition at older ages, yet there is a systematic over prediction in the concentration of REEs (Fig. 5F). A shorter duration of rifting at Erta Ale is in close agreement with plate reconstructions that suggest extension initiation in the southern Red Sea at 24 ± 4 Ma (Reilinger and McClusky, 2011), though slightly younger than the 26 to 29 Ma estimate of Wolfenden et al. (2005). For the Dabbahu magmatic segment, both the shorter and longer duration of rifting are plausible given the age constraints from kinematic reconstructions (Reilinger and McClusky, 2011;Wolfenden et al., 2005). The elevated temperature at Asal is in agreement with previous estimates of mantle temperatures in this region (Rooney et al., 2012b;Pinzuti et al., 2013).

Bulk seismic structure
The comparison of the predicted melt production with the observations yielded two preferred model scenarios: model 1450N at ∼22 Myr of extension and model 1350N after ∼30 Myr. The Afar region is well known for its extreme low-velocity anomalies. A range of surface-wave studies find minimum shear velocities between 3.6 and 4.0 kms −1 between about 50 and 100 km depth (Knox et al., 1998;Debayle et al., 2001;Montanger et al., 2007;Fishwick, 2010;Chang and Van der Lee, 2011). Next, the synthetic seismic structures predicted for the two preferred models are compared to these shear wave constraints. Fig. 6 displays in the left column model thermal structure, with contours for the amount of retained melt, for models with a 1450 • C and 1350 • C mantle temperature and different amounts of melt retention (models 1350N, 1450N, 1450L and 1450T in Table 1). Asymmetry in the zone of partial melting is created due to the 100 km east to west shift in extension after 11 Myr (see Section 2.5). In the central panels, the melt production and melt fraction is plotted against depth for the centre of extension. In the right hand column of Fig. 6, the corresponding shear velocity structure (dependent on temperature, pressure, composition, melt retention and dehydration) is shown.

Modelled seismic structure
The seismic structure is strongly affected by the thickness of the thermal lithosphere. Due to the decreasing ∂ V S /∂ T with increasing pressure (e.g. Goes et al., 2000), adiabatic upwelling of mantle material leads to lower velocities at shallower depths. As a result the cooler model, 1350N (Fig. 6A and B), with the thinner lithosphere (30 km) yields lower minimum velocities than the warmer model 1450N (50 km lithosphere) ( Fig. 6C and D). Note that the effect of the thinner lithosphere is even stronger than the effect of the larger amount of melt in model 1450N (up to 0.5% melt retained) than model 1350N (up to 0.3% melt retained). If melt retention is enhanced, velocities are further decreased. This is illustrated by the model with an order of magnitude lower permeability (1450L), where the maximum melt retention is increased to 0.9%, and the model where melt does not migrate until its porosity exceeds 1% (1450T), where the retained melt fraction is enhanced to 1.1% (Fig. 6E and G). These two models are also plotted at slightly later times than model 1450N (25, 24 and 22 Myr respectively). This contributed to the stronger and wider low velocity zones in the two high melt-retention models.

Comparison with tomographic constraints
We find that all model cases are able to produce low shear velocities in the observed range, due to the combination of strong attenuation at asthenospheric temperatures and the elastic effect on seismic wave speed from presence of melt (Fig. 6). The higher amounts of melt predicted by the models with stronger retention are required to reach the lowest values in the observed range.
Bulk seismic velocities provide a strong indication of a relatively shallow lithosphere-asthenosphere boundary at 50 to 30 km depth. If the lithosphere was thicker than 80 km, as proposed by Ferguson et al. (2013), then seismic velocities would not reach the minimum values observed. A thinning factor of 2 to 3 is therefore required by the seismic constraints and REE patterns. This is within the range of values we estimated to derive igneous thickness estimates in Fig. 3 and of crustal stretching factors inferred for Afar (e.g. Wolfenden et al., 2005). Furthermore, significant fractions of melt only form once the dry solidus is crossed, i.e., melt fractions of around 1%. This requires the lithosphere to have thinned to 50 km from an original thickness outside of the rift of 100 km. Thus, the seismic velocities provide further support for our choice of initial lithospheric thickness, and are consistent with the 1450 • C mantle temperature preferred by the comparison melt volumes and melt chemistry.

S-to-P receiver function constraints
A detailed S-to-P receiver function study of the lithosphere and shallow mantle below Afar (Rychert et al., 2012) found two prominent discontinuities below the rift flanks, one at around 30 km with a positive velocity change (i.e. increase of velocity with depth) attributed to the base of the crust, and one at about 80 km depth with a negative polarity associated with the base of the lithosphere (the 'LAB', lithosphere-asthenosphere-boundary). Within the rift  (Barrat et al., 1998), Dabbahu (Ferguson et al., 2013) and Asal magmatic segments (Pinzuti et al., 2013;Schilling et al., 1992). Parts (A)-(C) show the quality of the fit of the range of REE predicted against the observations, by calculating the misfit using Wald's Test (Wald, 1943). The gray lines display the confidence levels for the quality of fit. Parts (D) to (F) show the pattern of REE compositions at each region and the two model generated best fits. Dashed lines are for hydrous-only melt production. zone however, the negative polarity signal was absent and instead, under large parts of the rift, a positive polarity signal (with an amplitude 15-20% of that associated with the Moho) was found at around 50-70 km depth in addition to the Moho signal at 30 km depth. Rychert et al. (2012) interpreted the latter discontinuity as the base of the dry melt zone with an 8% change in shear wave velocity which has to occur over a depth interval of at most 10 km.

Models: impedance contrasts and receiver functions
Whether a seismic velocity jump leads to the conversions from S-to-P waves that are imaged with receiver functions, depends on the strength and sharpness of the impedance (product of velocity and density) contrast across the boundary. We show impedance gradients for the two asthenosphere temperatures 1350 • C and 1450 • C and with increased melt retention (models 1350N, 1450N, 1450L, 1450T in Table 1) in Fig. 7 and as 1-D profiles through the centre of extension in Fig. 8D (models 1350N, 1450N, 1450L). By far the strongest impedance contrast is that predicted for the Moho, while a negative impedance contrast characterises the base of the dehydrated lithosphere. A positive impedance contrast develops near the base of the melt zone, but its amplitude is only a very small fraction (<1%) of the Moho contrast even when we impose a threshold of 1% for melt transport (Fig. 7).
Vertical profiles of attenuation, shear velocity structure and shear impedance gradients directly below the centre of the model rift illustrate the velocity jumps (Fig. 8). This figure includes an additional model with low permeability where melt is assumed to have an additional attenuating anelastic effect. The models span a large range of potential melt retention and seismic sensitivity to the presence of melt. In spite of this, in all cases, the transi-tion from wet to dry melt regions remains too gradual to cause a strong impedance jump.
We calculate a synthetic receiver function for model 1450N and model 1450L (Fig. 9). At the rift axis we predict a profile with two positive peaks, separated by a negative one (Fig. 9B). The first positive and negative peaks of the synthetic receiver function are due to the Moho velocity contrast. The negative Moho side lobe is further enhanced by a negative peak due to the base of the dehydrated lithosphere, which is accompanied by a deeper positive side lobe. The base of the dry melt zone generates a negligible signal in the 1450N model and gives only a very minor contribution to the deepest positive swing in the 1450L model.

Comparison: impedance and receiver functions
Although higher melt retention and increased attenuation of the melt enhance the velocity contrast across the base of the dry melt zone, none of the models produce an impedance contrast that comes close to the sharp 8% increase in V S inferred from the actual receiver functions. Furthermore, this additional melt effect on attenuation reduces the V S to very low speeds, significantly below the lowest observation from surface waves (Fig. 8). Not even when all melt is retained, or productivity increased by reducing the solidus-depletion gradient, ∂ T s /∂ F | z (see Supplementary Material), from 300 to 200 • C, which deteriorates the fit of the REE data, are we able to significantly increase the impedance jump.
The expression of the inferred 8% jump in the receiver function is the absence of the negative lobe due to the Moho. This requires a shallow (between 50 and 70 km depth) positive discontinuity to largely cancel this part of the Moho signal (Rychert et al., 2012). The impedance contrast from our modelled base of the silicate (dry) melt zone occurs deeper in our warmer models, but ). (G, H) Model 1450T with reference permeability, but where melt is immobile if the porosity is below 1%. Maximum melt retention ranges from less than 0.5% (panels A, C) to 1.1% (panel G). All cases give minimum V S consistent with those imaged below Afar (see text). Seismic velocities are more sensitive to depth of the lithosphere-asthenosphere transition than to potential temperature in the range from 1350 to 1450 • C. The slight asymmetry in the zone of partial melting is due to the 100 km east to west shift after 11 Myr in the centre of extension across the Danakil block. Models 1450N, 1450L and 1450T are plotted at the same numerical time step, which corresponds to slightly different duration's given the differing convective vigour between the models. (For interpretation of the colours in this figure, the reader is referred to the web version of this article.) even in the 1350N model it is insufficient to mask the negative side lobe from the Moho conversion.
A further difference between our profiles and the structure that the receiver functions found below the rift zone is that the negative polarity jump due to the base of the lithosphere is also present (Fig. 9). It has been proposed by a range of seismic and resistivity studies that the lithosphere below the rift is heavily melt intruded by as much as 3% melt (Stork et al., 2013;Desissa et al., 2013). This melt is thought to occupy a large region at and below the Moho beneath Dabbahu segment (Desissa et al., 2013). If the amount of melt in the lithosphere is at least as large as that in the asthenosphere directly below, the velocity contrast at the lithospheric base could be cancelled. Thus, the absence of an 'LAB' type discontinuity in the receiver functions would further confirm that the lithosphere below the rift is meltfilled. We find that a 1% melt porosity above the zone of partial melting can mask the impedance jump between the dehydrated lithosphere and damp, melt-filled underlying mantle. The synthetic receiver functions also predict a Moho jump that is too strong, unless we assume a low velocity lithosphere. One might argue that such a melt-filled lithosphere should not be called a lithosphere, as it is no longer seismically distinct. However, it may still behave different rheologically than the asthenospheric mantle below, leading to the relatively narrow zone of shallow vertical and along-axis alignment of seismic anisotropy Hammond, 2014). The assumption that there is 1% melt in the lithosphere creates a synthetic receiver function which is closer to the observation (Fig. 9). Yet this assumption is somewhat ad hoc and still cannot replicate the signal previously attributed to the base of the dry melt zone. Therefore, the comparison with receiver functions does not allow us to further narrow down the plausible model scenarios.

Melt chemistry
Overall, 1450 • C and a rift age of 22 to 23 Myr gives the best match to the REE data and crustal thickness. Models with a more depleted source composition fit the data but yield the same best-fit age and temperature. The variable quality of the fits to the observed lavas between rift zones may reflect some lateral variations in temperature and rift onset times.
In all of our modelled melts there is a systematic underestimation of the concentration ratios between the most incompatible elements, such as La and Ce, when compared to the observed lavas. The positive slope between La and Pr in Fig. 5 is because of model source depletion. It is possible that our assumption that melt and solid are in equilibrium for a particular model time step, which has a typical duration of 40 kyr, allowing a simple mass balance to be used to estimate the solid composition as melting progresses is not valid. The inverse model used to model REE melt compositions by Ferguson et al. (2013) for example assumes fractional melting throughout. The other possibility is that there is a greater flux of fertile mantle through the melt zone due to a buoyant mantle source, such as a thermal plume beneath Afar that the present model does not capture, which leads to less source depletion.

The onset of melt retention and seismic discontinuities
Our models indicate that the positive discontinuity amplitude that S-to-P receiver functions image at a depth of 50-70 km below the rift zone cannot be a simple consequence of the isotropic effects of melt on seismic velocities. Different melting models, like that developed by Katz et al. (2003), also predict a gradual onset of melting and hence a gradual reduction in V S and V P , especially for melting in the presence of water.
A recent 1-D modelling study (Havlin and Parmentier, 2014) suggests that strong discontinuities associated with melting can be produced in a mantle of higher viscosity and/or lower water content or at high upwelling rates. Our models are not able to match melt constraints using a dry mantle, and it would thus be hard to argue for significantly higher viscosity. Higher upwelling rates provide faster influx of new fertile material and hence enhance melt production and for the same permeability also increase melt retention. According to the models of Havlin and Parmentier (2014), upwelling rates of several tens of cm/yr are required. Using the same numerical model as presented here, Goes et al. (2012) found melt retention increased to 1.3% when the half spreading rate and hence vertical flow was roughly 6 times greater, insufficient for a strong impedance contrast at the onset of dry melting. It is unclear that the conditions under Afar, with much lower spreading rates and potential active upwellings with relatively mild excess temperatures, could produce fast enough upward flow to allow formation of an 8% shear-velocity discontinuity. If so, the receiver function signal could be further support for a hotter mantle with active upwellings, although it may not be a direct effect of the onset of dry melting, which at higher temperatures occurs deeper than 50-70 km depth. Furthermore, increased upwelling at a mantle temperature of 1450 • C, which is consistent with the melt chemistry, would increase crustal thickness beyond that observed.
While it is not easy to explain the velocity increase at 50-70 km depth below Afar by the onset of melting, similar S-to-P observations in a range of hotspots (e.g. Hawaii, Galapagos; Rychert et al., 2013Rychert et al., , 2014 do indicate that the discontinuities are a signal of melt in the mantle. Another possible mechanism to sharpen the impedance contrast could be a change in seismic anisotropy with depth, a similar mechanism to that which is suggested below  Fig. 6). (B) Attenuation structure for the three models in panel A, and a fourth model where the low permeability is combined with an increased attenuation due to the partial melt retained in the mantle (dashed line). (C) Shear wave velocity for the four models in panel B with the range of observed shear wave velocities extracted from surface waves below Erta Ale, Dabbahu and Asal (Fishwick, 2010). (D) Shear impedance-depth profiles for the same four models. None of the cases give a velocity and impedance contrast approaching that inferred from Afar receiver functions (about 8% change in V S over <10 km and about 20% of the Moho impedance contrast, respectively Rychert et al., 2012). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.) cratons (e.g. Yuan and Romanowicz, 2010). Melt that retains a preferential orientation is an efficient mechanism to generate seismic anisotropy (Blackman and Kendall, 1997), and so a rapid change in the style of preferred orientation of melt with depth could cause a sharp discontinuity. In rift settings a number of mechanisms exist that could align melt. Where significant topography exists on the lithosphere-asthenosphere boundary, high strain rates can cause melt bands to form (Holtzman and Kendall, 2010). Also, the alignment of melt as it migrates through lithosphere (Faul, 2001) or in the presence of vertical flow in the asthenosphere (Blackman and Kendall, 1997) will likely cause a seismic anisotropy. Shear-wave splitting and P -wave receiver function studies show that vertically aligned melt exists at the rift axis and rift margins in the crust and lithospheric mantle beneath Afar Hammond, 2014), whereas much of the asthenospheric mantle beneath Afar has little shear-wave splitting suggesting horizontally aligned or no preferential alignment of melt . A change in alignment of melt between the lithosphere and asthenosphere or due to the onset of melt mobilisation, which enhances preferential orientation as melt fraction approaches 1%, could cause the discontinuity seen in S-wave receiver functions.

Conclusions
This study suggests that mantle temperatures are elevated below Afar and the lithosphere has significantly thinned. This agrees with aspects of previous geochemical and seismic interpretations: Rare-earth-element concentrations from mafic magmas point to melt in the garnet stability field, at elevated temperatures 1450 • C. However, rather than requiring a thick lithosphere (Ferguson et al., 2013), our models reproduce the REE data with the addition of hydrous melting. S-to-P receiver functions imaged a seismic discontinuity at ∼50-70 km depth which was attributed to the base of the dry melt zone, requiring a thinned lithosphere (Rychert et al., 2012). To also fit constraints on volumes of magma in the crust, our models require that rifting started at ∼23 Ma, and the mantle is ∼1450 • C, leading to lithospheric thinning by about a factor of 2, to around 50 km, and significant decompressional melting. However, in none of the models are the isotropic effects of retained melt on seismic velocity able to recreate the strong and sharp change in shear speed found in the receiver functions at 50-70 km depth. Alternatively, we propose that the discontinuity Fig. 9. Synthetic receiver functions for the forward model compared with a characteristic rift receiver function from Rychert et al. (2012). To be able to make a fair comparison with the receiver function derived from observations, we set the Moho depth from 20 to 30 km to be consistent with the published receiver function below Afar (Rychert et al., 2012). (A) Shear wave velocities through the centre of extension for the model 1450N (red dashed line), the same model where we assume 1% melt is in transport within the lithosphere above the zone of partial melting (red solid line) and the model 1450L with additional attenuation due to the presence of melt (black solid line). These vertical profiles are compared against surface wave estimates of V S from Fishwick (2010) in light blue. (B) Synthetic receiver functions from the model 1450N (red dashed line), model 1450N with melt above the zone of partial melting (red solid line) and the model 1450L with additional attenuation due to melt (black solid line). These synthetic receiver functions are compared to the binned receiver function from Afar in light green, with statistical errors estimated from the 95% confidence interval from a bootstrapping exercise Rychert et al. (2012). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.) is a consequence of a change in seismic anisotropy due to melting. In other respects, extension above a mantle with elevated temperatures reconciles the observed magma chemistry, estimates of magmatic crust volumes and minimum mantle shear velocities below the Afar rift.