How Do Rift‐Related Fault Network Distributions Evolve? Quantitative Comparisons Between Natural Fault Observations and 3D Numerical Models of Continental Extension

Continental extension is primarily accommodated by the evolution of normal fault networks. Rifts are shaped by complex tectonic processes and it has historically been difficult to determine the key rift controls using only observations from natural rifts. Here, we use 3D thermo‐mechanical, high‐resolution (<650 m) forward models of continental extension to investigate how fault network patterns vary as a function of key rift parameters, including extension rate, the magnitude of strain weakening, and the distribution and magnitude of initial crustal damage. We quantitatively compare modeled fault networks with observations of fault patterns in natural rifts, finding key similarities in their along‐strike variability and scaling distributions. We show that fault‐accommodated strain summed across the entire 180 × 180 km study area increases linearly with time. We find that large faults do not abide by power‐law scaling as they are limited by an upper finite characteristic, ω0. Fault weakening, and the spatial distribution of initial plastic strain blocks, exert a key control on fault characteristics. We show that off‐fault (i.e., non‐fault extracted) deformation accounts for 25%–45% of the total extensional strain, depending on the rift parameters. As fault population statistics produce distinct characteristics for our investigated rift parameters, further numerical and observational data may enable the future reconstruction of key rifting parameters through observational data alone.

blocks developed from strain localization potentially host hydrocarbons (Sorkhabi & Tsuji, 2005) and/or allow for carbon capture storage (Kaldi et al., 2013).Complex normal fault networks are also favorable for geothermal energy production (Coolbaugh et al., 2006;Faulds & Hinz, 2015).Our understanding of how fault populations are characterized and how they evolve through time is thus crucial to the identification of potential energy storage and waste sites.
It has been difficult to deduce the true scaling nature of fault networks, as data are usually obtained from mapping which contains an insufficient range of fault numbers (Cladouhos & Marrett, 1996;Scholz, 1997).Often fault observations span only one order of magnitude in terms of fault scale (Bonnet et al., 2001) and the limited spatial extent incurs sampling biases (Pickering et al., 1995).The finite size of data sets leads to underestimation of small faults due to limited resolution (truncation) and undersampling of the largest faults (censoring), which overall alter the appearance of the distribution (Figure 2) and may lead to erroneous conclusions of the style and factors controlling fault patterns.Previous statistical strain analyses are often presented in 1D (through borehole images and well logs) or 2D (fault lengths in outcrop or summing heaves across cross sectional transects; Cowie et al., 1995;Gupta & Scholz, 2000;Marrett & Allmendinger, 1992;Torabi & Berg, 2011;Walsh & Watterson, 1991).It remains a challenge to extrapolate results obtained from 1D and 2D data sets to 3D systems (Bonnet et al., 2001).
In addition to the lack of spatially extensive data sets in active and ancient rifts, there are even fewer temporally constrained data sets.Earthquakes only provide a short snapshot in time relative to the long geological timescales in which their displacement accumulates.Few faults are associated with age-constrained growth strata that record fault activity over geological timescales (e.g., Meyer et al., 2002).The lack of kinematic constraint has led to different models of how individual faults grow (e.g., Pan, Bell, et al., 2022;Rotevatn et al., 2019;Walsh et al., 2002).Furthermore, it is unclear how other forms of deformation (i.e., diffuse, ductile or aseismic slip) are accommodated during extension.Limitations in spatial and temporal resolution from observational data means there is little understanding of how fault networks evolve quantitatively, and even less understanding of how underlying rift dynamics may govern the geometry and kinematics of natural rift fault networks.While numerical and analog models are increasingly able to resolve fault growth and mechanics in 3D, they may not account for large lithosphere-scale parameters, such as the effect of thermal variation in the crust (e.g., Finch & Gawthorpe, 2017;Zwaan et al., 2021).
Here, we produced five high resolution (<650 m), 3D thermo-mechanical forward models of lithospheric deformation to investigate the effect of well known, first-order rift controls, such as extension rate and rheology, on the structural expression of normal fault network evolution.The along-strike variability revealed by the 3D models enable key fault attributes, such as the length, active and cumulative strain, and strike, to be extracted across a range of spatial scales throughout their evolution.We characterize the scaling distribution of fault accommodated strain through time, and investigate how different kinematic and rheological model parameters affect the overall geometry of the fault network in 3D.We compare differences between measured crustal strain in the models with the summation of strain accommodated by normal faults, resulting in off-deformation percentage estimates for each model.Finally, we compare the modeled fault statistics with those observable in the Earth's crust, allowing us to better understand and potentially infer first-order controls on rift dynamics, based on observed fault patterns.PAN ET AL.

Model Description and Setup
The open-source dynamics code ASPECT (Heister et al., 2017;Kronbichler et al., 2012) is used to model 3D continental extension.The thermomechanical model solves the incompressible Boussinesq approximation of momentum, mass and energy equations, combined with advection-diffusion equations.For a full description of the governing equations, see Naliboff et al. (2020) and Pan, Naliboff, et al. (2022).The governing equations are solved on a 3D gridded domain spanning 500 × 500 × 100 km (X, Y, and Z respectively).The grids are 5 km resolution at the sides and base of the model, and we use an adaptive mesh refinement to successively reduce this down to 625 m over a 180 × 180 × 20 km region in the center of the model.Velocities are prescribed to the left and right of the model.Inflow at the model base balances the outflow, the front and back faces are free-slip, and the top of the model is a free surface (Rose et al., 2017).
The model contains three compositional layers: the upper crust (0-20 km), the lower crust (20-40 km), and the mantle lithosphere (40-100 km).Viscous flow laws for dislocation creep are wet quartzite (Gleason & Tullis, 1995), wet anorthite (Rybacki et al., 2006) and dry olivine (Hirth & Kohlstedf, 2003) containing background densities of  2,700, 2,800, and 3,300 kg m −3 .The temperature distribution follows a characteristic conductive geotherm for the continental lithosphere (Chapman, 1986) where the temperature at the base of each layer is 633, 893, and 1,613 K (for parameters and assumptions used to solve for the conductive profile, see Table S1 in Supporting Information S1).
The compositional layers deform through a combination of nonlinear viscous flow (following dislocation creep) and brittle plastic deformation (Glerum et al., 2018).Brittle deformation follows a Drucker Prager yield criterion: where the initial friction angle (ϕ) and cohesion (C) are 30 and 20 MPa, and they linearly weaken as a function of finite plastic strain (see Figure 3).Following Duretz et al. (2020) the Drucker Prager yield criterion is modified further to include a plastic (viscous) damper, which acts as a stabilization term for the shear band width and produces mesh-independent results, providing a sufficient resolution to resolve the damper viscosity (1e21 Pa s here).Note that the entire extracted region (5 km beneath the initial model surface) is yielding (Figure S1 in Supporting Information S1), hence our study tracks frictional-plastic strain only.
Instead of a single zone of initial weakness, a distribution consisting of binary strong and weak blocks of plastic strain are prescribed in the upper crust, following Pan, Naliboff, et al. (2022).The statistically randomized method geologically mimics variations in the initial strain field that may reflect the mechanical properties of the rifting plate (e.g., Scholz & Contreras, 1998), the presence of the pervasive and/or discrete fabrics (e.g., Versfelt & Rosendahl, 1989) and stronger regions such as cratons (e.g., Buiter & Torsvik, 2014;Dunbar & Sawyer, 1989;Tommasi & Vauchez, 2001;Versfelt & Rosendahl, 1989;Ziegler & Cloetingh, 2004).Overall the initial strain field promotes the development of a complex fault network displaying the along-strike variability in strike and displacement observed in natural rifts.
In this paper, the plastic strain in the reference model (Model A) consists of strong and weak values of 0.5 and 1.5 spatially over 2.5 km 2 blocks, with the friction angle and cohesion weakening by a factor of 2 over the plastic strain interval.Relative to the reference model, we investigate the effect of an increased spatial wavelength of 5 km blocks (Model E; Figure 3), weakening the friction angle and cohesion by a factor of 4 (Model D) and reducing the contrast between blocks to 0.5 and 0.6 (i.e., faster brittle weakening; Model C).We also investigate extension rates of 1.25 and 5 mm/yr (see Figure 3); these correspond to a range comparable to those characterizing slow active rifts (e.g., East Africa, Basin and Range) using GPS data (e.g., Argus & Heflin, 1995).

Fault Network Extraction
Fault analysis is sampled on a horizontal plane located 5 km beneath the initial model surface as the full extent of the fault network is captured at this level (Figure 5).Fault statistics are extracted using an image-processing workflow from Pan, Naliboff, et al. (2022), which derives the strain profile gradient from the active deformation field.This approach effectively defines a fault as a region of clearly localized active slip, and enables the extraction of discrete fault segments within a complex, interacting fault network (Figure 5c).Each fault label contains a unique accessible index, and each label acts as a mask in order to extract active and finite strain, as well as to compute geometries such as fault length and strike.In Pan, Naliboff, et al. (2022), fault labels are thinned so that they are one element width along the fault in order to derive the cumulative euclidean distance (i.e., fault length).
In contrast, here the thinned labels are later enlarged (morphologically dilated using a structuring element-see Supporting Information S1 for full details), in order to capture the full extent of shear zone deformation, which typically span 2-3 grid elements (Figures 5c and 5d).The total strain occupied and extracted by each fault label is therefore equivalent to the geometric moment, M g (see next Section 2.3).

Fault Strain Summation
Incremental strain is often calculated in earthquake seismology using the Kostrov summation method (Kostrov, 1974;Molnar, 1983).Similarly, strain on faults over geological timescales can be determined using the geometric moment, M g : where u ave is the average displacement and n is the fault surface area (e.g., Cowie & Scholz, 1992;Meyer et al., 2002), which in 2D plan view here represents the fault trace length.The geometric moment is directly related to the seismic moment (which is the geometric moment divided by the shear modulus; Scholz, 1989) allowing for comparison of faults active over geological timescales with the Gutenberg-Richter power-law population of characteristic earthquake populations (Gutenberg & Richter, 1944;Kanamori & Anderson, 1975;Scholz & Cowie, 1990).Summation of the geometric moment divided by the area of regional interest (V) allows for an estimation of strain, ε (Marrett & Allmendinger, 1990;Molnar, 1983): The above equation results in the total strain accommodated by the fault network (i.e., Figure 5e).In addition to the strain occupied by our extracted faults (i.e., Figure 5d), we also sum the total crustal strain, ε (i.e., Figure 5a) and active strain (i.e., Figure 5b) over the central 180 × 180 km high-resolution zone.As a result, the difference between the total crustal strain and fault strain equates to off-fault deformation (OFD), or OFD (see Figure S2 in Supporting Information S1).
While fault segment extraction is necessary to plot cumulative fault distributions, discrete faults are not necessary to compute OFD.While the fault skeleton is widened to incorporate the full extent of shear zone deformation (Section 2.2), the amount of widening is expected to play a role in the measurement of OFD.We therefore explore the impact of how fault extraction may be affect OFD by comparing: (a) the method by Pan, Naliboff, et al. (2022) with conventional binary thresholds (Figures S3 and S4 in Supporting Information S1); and (b) widening the fault skeleton with different structural elements to explore its impact on OFD across the reference Model A (Figures S3 and S4 in Supporting Information S1).For consistency, we present the Pan, Naliboff, et al. (2022) method in the main results.

Fault Scaling Distributions
The equation of a power law fault population distribution (e.g., Childs et al., 1990;Kakimi, 1980;Scholz & Cowie, 1990;Villemin et al., 1995;Walsh & Watterson, 1991) follows: where ω is the measure of size (e.g., length, displacement or geometric moment), N is the cumulative number of values ≥ω, A is a constant and α is the exponent.A log transformation of this equation gives a linear relationship between log N and log ω, with the slope gradient representing the power law component, α (Figure 2).
An exponential scaling distribution better describes fault length distributions (Ackermann et al., 2001;Cowie et al., 1993;Spyropoulos et al., 1999) and follows: where the exponential law incorporates a characteristic scale ω 0 , which may reflect a physical length in the system such as layer thickness (Cowie, 1998).
The gamma distribution is more commonly used in earthquake statistics (Davy, 1993) but it also characterizes fault trace lengths in the Gulf of Corinth rift (Michas et al., 2015).The gamma law follows: The gamma law distribution combines a power law from Equation 4with an exponential tail from Equation 5(Figure 2).
We used curve fitting and applied a non-linear least squares fit to find the optimal set of parameters for the defined function (Equations 4-6).We use the geometric moment (see Section 2.3) in order to analyze the scaling distribution of strain, where the initial input variables for A, α, q, and ω 0 are the fault number, 1, 0, and the mean fault strain, respectively (following a similar methodology from Cowie et al. ( 2012)).For clarity, only the best fitted distribution is shown in later figures, however the parameters for other distributions are shown in Figure 6.We assume that the smallest 30 faults are affected by truncation as the data points characteristically flatten off at this point in all models (Figure 6).The smallest 30 faults are therefore discarded for curve fitting, and the remaining number of faults is sufficiently large across several orders of magnitude (between 200 and 500 faults depending on the model) to provide a robust statistical analysis.We further discuss the implications of sampling bias in the discussion.

Modeled Fault Network Evolution
The general behavior of the modeled fault network and their corresponding statistics are first described using reference Model A. All models show that fault network template is established from the initial onset of extension (Movie S1).In the first resolvable time increment (at 0.1 Myrs), the reference Model A consists of a diffuse network of deformation across the model domain (Movie S1).From 0.1 to 0.5 Myrs, active deformation shows a reduction in element width as shear zones actively localize on the fault template-we attribute this behavior as the full establishment of fault pattern lengths.After 1 Myrs of extension, a number of fault maximas appear randomly distributed throughout the model, and strain predominantly accumulates on the initial maxima points (Movie S1).The evolution of all models demonstrate similar general behavior of widespread strain localization, from an initially distributed deformation field, to a localized, large normal fault array (e.g., Cowie et al., 2000;Gawthorpe et al., 2003).While all investigated models reveal that fault patterns are established early and strain localizes with extension, the fault pattern template and style of subsequent strain deformation exhibit considerable variability across the investigated rift parameters (e.g., Figure 4).
The cumulative frequency of fault strain of reference Model A (Figure 7a) shows a curved distribution similar to many other natural observations of fault scaling (e.g., Meyer et al., 2002;Soliva & Schultz, 2008).Power-law, exponential scaling, and gamma distributions all fit the observational data to R 2 ≥ 0.99, although the high R 2 value is attributed to the logarithmic nature of the data set.We find that the gamma has the highest R 2 , and critically, best matches the largest faults in the model (Figures 6 and 7a). Figure 7a shows that up to 1 Myrs extension, the total number of faults increases as the strain increases-we correlate this to an "initiation" phase where deformation quickly localizes and the lengths of the fault network pattern are fully established.For the remainder of extension, the fault distribution profile moves toward the right as faults increasingly accommodate strain, and the total number of faults stays relatively constant (Figure 7a).The parameters of best fit show that ω 0 increases with extension, and q ranges between 1 and 3. α stays relatively constant throughout time, ranging between 0.1 and 0.3 (Figure 7a; Movie S2).
While the central portion of the fault distribution progressively accommodates strain, reflected by stable increases in α, faults accommodating the highest strain are variable throughout extension, where the magnitude of highest strain periodically fluctuates from the q-gamma fit trendline (Figure 7a; Movie S2).We note that the periodic fluctuations of the fault distribution are sometimes initially marked by a break in slope on the log plots (e.g., see 1 and 3 Myrs distribution on Figure 7a) resulting in the appearance of a prominent tail.The fluctuations above (i.e., strain greater than) the trendline correspond to continuous active slip events, where the summed strain along large fault lengths is high.Conversely, deviations below the trendline suggest that the strain intermittently accommodated by faulting is relatively distributed throughout the crust (rather than accommodated by a few large faults).The total summed strain accumulation, accommodated by the extracted fault network, increases approximately linearly through time, however there are small temporal fluctuations which reflect the oscillations between distributed and localized deformation (Figure 7b).

Extension Rate
Models deforming at constant full rates of 1.25 and 5 mm/yr (Models A and B; Figure 3) allow for the investigation of extension rate on fault network evolution.The models show that higher extension rates (5 mm/yr) correspond to an overall higher magnitude of plastic strain deformation over increasingly diffuse regions (Figures 8a  and 8b).Our results also show that an increase in extension rates do not affect the overall spatial pattern of finite strain (Figures 8a and 8b).For example, fault maximas and localization occur on the same fault systems no matter the extension rate, although to some extent the fault strain magnitudes and shear width of deformation are variable across the investigated models (Figures 8a and 8b).
Similar to the other investigated parameters, fault strain distributions for all models investigating extension rates abide by a q-gamma distribution (R 2 > 0.99; Figure 8).The distribution appears broadly similar across the investigated models, whereby greater extension rates accommodate higher strain throughout extension, so plot further toward the right in Figure 8.For both models, ω 0 increases with extension, but overall magnitudes are higher for the 5 mm/yr model-at the final 5 Myrs timestep shown, ω 0 is 2.7 × 10 6 , 3 times higher than that of the 1.25 mm/yr model at 8.7 × 10 5 (Figure 8).For the 1.25 mm/yr model, α steepens from 0.21 to 0.07 from 1 to 2 Myrs, respectively, before increasing for the remainder of extension (Figure 8a).For the 5 mm/yr model, alpha increases (i.e., the central portion of the slope shallows) with extension, and reaches a relatively higher value of 0.32 by 5 Myrs extension (in contrast to 0.2 for the 1.25 mm/yr model).

Rheology
Our results indicate that changes to the amount and effectiveness of frictional plastic strain weakening in the brittle upper crust exert the most significant control on fault patterns, both visually due to along-strike variations in strike, strain, length and density, and statistically due to differences in their scaling distributions (Figure 10).Most notably, models with relatively weak faults (Models C and D) contain less faults, thus strain is localized onto fewer, larger structures (Figures 9b and 9d).The cumulative strain profiles of these two weak models reflect their localized behavior, where the gradient of strain distribution trends shallower relative to the reference model (Figure 10).The q-gamma fitted parameters show that highly localized fault populations such as Models C and D both exhibit lower α values of c. 0.03 and 0.09 and thus shallower trends for the central portion, and contain higher values of ω 0 .Although the cumulative strain distributions are comparable between the two weak models (Model C and Model D), their fault template are significantly different whereby Model D exhibits significantly more pronounced along-strike variability in local azimuth (Figure 9d).
Investigations into the spatial wavelengths within the initial distributions of plastic strain (Model A: 2.5 km 2 or Model E: 5 km 2 ) here reveal that strain scaling distributions appear comparable as both exhibit relatively higher α values (0.27 and 0.2, respectively) and lower ω 0 values (Figures 9a and 9e).However, the faults in Model E overall accommodate less strain magnitude in their scaling distribution (Figure 9e), and the visual style of the pattern of deformation differs significantly between the two models, with larger initial blocks producing more continuous, sinuous fault patterns with less along-strike variation (Figure 9c).

Fault-Accommodated Strain Through Time
Figure 10 shows the total fault accommodated strain through time across all models, and a snapshot of their fault strain distribution at 5 Myrs.For the cumulative distributions in Figure 10a, the geometric moment observed from the NW Shelf of Australia (Meyer et al., 2002;Pan, Bell, et al., 2022)   Overall the 5 rift models reveal that summed fault strain increases linearly through time (Figure 10b) when at constant extension rates.The rate at which summed fault strain increases is variable, and lower rates correlate to lower rates of extension (Figure 10b).Within models of the same 5 mm/yr extension rate (Models A, C, D, E), higher rates of strain accumulation occur in models characterized by relatively weak crust and/or faults (i.e., Models C and D).Overall we find that the range of fault strain accumulation lies within the observational fault data (Figure 10b).
Figure 10a reveals that the strain scaling distribution of faults all exhibit the typical curvature characterized by observational data sets.Fault populations are best fit by a q-gamma distribution but exhibit variable fitted parameters, particularly q and ω 0 .The lower 1.25 mm/yr extension rate model plots closest to the left, consistent with findings that it exhibits lower strain magnitude in comparison to the other models (Figure 10a).In contrast, Models C and D with weaker strain mechanisms plot furthest to the right and thus their faults accommodate the  (Pan, Bell, et al., 2022) and the Timor Sea (Meyer et al., 2002).Total strain accumulation is summed either across transects in the Corinth Rift (Bell et al., 2011) or in 3D (Pan, Bell, et al., 2022).Due to limitations in data and kinematic constraints, a range marked from upper and lower estimates are shown.

Off-Fault Deformation
The summed difference between extracted fault accommodated strain and total crustal strain (e.g., Figure 7) may account for diffuse deformation, which we term as OFD. Figure S2b in Supporting Information S1 shows how the OFD of Model A changes through time.OFD% rapidly decreases within the first 0.8 Myrs, which we correlate to fault organization and initiation (Figure S2b in Supporting Information S1).For the remainder of time, OFD% progressively reduces, corresponding to progressive strain localization onto faults (Figure S2b in Supporting Information S1).A similar model behavior to the reference model is depicted in Figure 11, where the majority of the scatter and outliers in the plot is due to the initiation phase in the first 1 Myrs.An exception is the Model (B) with a slow extension rate (1.25 mm/yr), which shows a reverse trend of fault localization from the onset of extension (Figure 11).On average, Figure 11 shows that OFD% accounts for approximately 40% in the models, ranging from 25% to 45% depending on the investigated rift parameters.We find that the Model (C) with an increased rate (10X with respect to the reference model) of fault weakening contains the lowest proportion (25%), followed by the Model (D) with a large magnitude (4X with respect to 2X in reference Model A) of fault weakening-given the weakening mechanisms this corresponds to the higher occurrence of strain localization observed (Section 3.2.2).

Fault Strain Comparison Between Numerical Models and Natural Observations
The extraction of fault geometry within an entire fault population has enabled us to demonstrate that modeled faults using physical approximations commonly used in long-term tectonic simulations are quantitatively similar to the fault geometries observed in rifts (Figure 10).Modeled fault populations exhibit a curved distribution similar to size distributions observations in the literature for length (e.g., Odling, 1997;Scholz et al., 1993;Yielding et al., 1996), displacement (Knott et al., 1996;Marrett & Allmendinger, 1992) and geometric moment (e.g., Bailey et al., 2005;Meyer et al., 2002; Figure 10a).The total fault strain of the models also quantitatively plot within the range of strain derived for observational data (Figure 10b).Models which experienced 1.25 mm/yr rates of extension lie closest to the observational range for the Exmouth Plateau-taken at face value, this predicts that ancient extension rates are relatively low based on its magnitude strain through time.Although our results have demonstrated that fault strain is largely comparable between models and observations, we find that direct rift inversions to recover rift parameters using summed strain measurements (i.e., Figure 10b) are still not achievable due to uncertainties in (a) measured observational strain, which is subject to interpretative biases and limited spatial resolution (e.g., Bonnet et al., 2001); and (b) overall rift duration, as the structure of active and ancient rifts observed provide only a snapshot in time, and there is often uncertainty on the age of the oldest syn-rift sediments to date the rift age as these deep sediments are rarely drilled.Fault backstripping may enable further data points of strain-time, however the earliest resolvable growth strata preserved in ancient systems rarely precedes 5 Myrs of rifting, therefore additional modeling of longer timescales may be needed for comparison.

Normal Fault Growth and Rift Evolution
The spatial and temporal scales covered by numerical modeling give insight on the earliest stages of fault growth, which are not easily constrained using seismic reflection and borehole data (C. A. L. Jackson et al., 2017).Here, all models establish their finite strain patterns from the nearly onset of extension, that is, faults abide by the "constant-length" model (Stage 1; Figure 12) consistent with Walsh et al. (2002).The rapid establishment of fault patterns within the first <100 kyrs is attributed to the initial randomization of the crustal (plastic) strain field.
Given that the "constant-length" model originally referred to the reactivation of faults due to inherited structures (e.g., Childs et al., 2003;Walsh et al., 2002), the model findings here suggest that sufficient crustal weakness alone may promote the rapid development of the fault template.Our results demonstrate that the weaknesses in the initial plastic distribution control the loci for which strain maxima form (Stage 2; Figure 12) and which eventually become the future sites of displacement accumulation (Stage 3; Figure 12), similar to the findings of Cowie et al. (1995).The behavior of strain localization onto through-going faults is consistent with Sornette and Sornette (1990), Cowie (1998), andGupta et al. (1998), supporting that it is a fundamental characteristic of fault network evolution.While all models undergo strain localization, models with larger (i.e., weaker) brittle strength contrasts between initial binary blocks or a faster rate of brittle weakening (C and D) trend more exponentially from the onset of extension (e.g., Figure 9).The investigations of initial strain distribution in the models highlight the importance of crustal strength variations, which may be distributed randomly, or contain spatial significance that is, pre-existing structures (e.g., Duffy et al., 2015), shear zones (e.g., Phillips et al., 2016) and terrane structures (e.g., Daly et al., 2014).We suggest that the relative and/or bulk strength from structural inheritances play an important role in structural development and may be further investigated through numerical modeling, where the initial strain field can be statistically quantified.

Fault Strain Scaling Distribution Evolution
Despite the assumption that faults abide by a power law scaling, we find that the distribution of the modeled faults is best described by a q-gamma distribution (a power law across moderate sized faults, with an exponential tail for the largest faults within a population), consistent with earthquake and seismic hazard assessments (Davy, 1993;Kagan, 1997;Main, 2000;Sornette & Sornette, 1990) and a study of a naturally occurring fault network in the Gulf of Corinth (Michas et al., 2015).Our results show that faults are not entirely fractal; instead, the distribution curve becomes steeper due to an upper bounding characteristic scale (e.g., Figure 7).We argue that the large bound limit of the fault distribution in our results is real and is not due to censorship as the modeled area (180 × 180 km) is rift scale and is sufficiently large to capture the biggest strain-accommodating faults.Furthermore, the position at which the fall-off occurs is significantly greater than previous deductions of censorship effects, and occurs across all investigated models of different strain magnitudes.We speculate that the upper limit of fault distributions is related to similar upper bounding limits that define observational D-L profiles-faults cannot physically exhibit strain beyond a certain limit due to the thickness and strength of the crust (e.g., Cowie & Scholz, 1992).
Across all investigated model parameters, α remains relatively consistent between 0 and 0.3 and is comparable to similar scaling distributions of geometric moment from the NW Shelf of Australia (α = 0.4-0.5; Figure 10).When fitted with a powerlaw distribution, the scaling exponent is c. 0.7 (consistent with existing literature which ranges between 0.3 and 2 (Cladouhos & Marrett, 1996;Knott et al., 1996;Marrett & Allmendinger, 1992;Schlische et al., 1996;Scholz & Cowie, 1990;Turcotte, 2007;Villemin et al., 1995;Yielding et al., 1996)).Previous numerical modeling (Cowie et al., 1993) and outcrop data (Wojtal, 1986(Wojtal, , 1994(Wojtal, , 1996) ) suggest that the power-law scaling exponent decreases through time, due to the progressive concentration of strain onto large faults.While our models clearly exhibit progressive strain localization, and the fault distributions here are visually comparable to those in aforementioned studies, our study correlates the behavior of fault distribution profiles to the upper-bound exponential scaling characteristic, ω 0 , which increases with time (Figures 7,9,and 11).
Our results demonstrate that while the majority of the fault network distribution steadily accumulates displacement through time (i.e., the plot moves toward the right with consistent α), the tail-end of the distribution (i.e., strain accommodated on the largest faults) is more transient (Figure 7).We correlate the transient behavior of large faults to the complexity of the fault network and interaction of fault segments due to competing stress redistribution (e.g., Cartwright et al., 1995;Dawers & Anders, 1995).Here, the extracted faults exhibit long lengths when they interact, producing strike-continuous slip events, and subsequently strain is redistributed along the entire length of the fault system (across relay ramps) resulting in a high fault strain value.We find that the occurrence of a prominent tail in log plots (e.g., at 2 Myrs on Figure 7; see also Figure 12) may be reflective of linkage events for large faults, consistent with initial discussions from Wojtal (1996).The break in slope on log-log plots and prominent tail is apparent in fault size distribution plots from experimental models (e.g., Ackermann et al., 2001), and field studies on Earth (e.g., Casey et al., 2006;Gudmundsson et al., 2013;Soliva & Schultz, 2008) and other planetary bodies (e.g., Vallianatos et al., 2016).While some studies discuss the presence of a prominent incipient tail (e.g., Gudmundsson et al., 2013;Soliva & Schultz, 2008;Walsh et al., 2003), its relevance is not fully characterized or understood, and we encourage future work to investigate its significance.
Our attempt to characterize scaling laws have highlighted how sampling protocols and fitting procedures can and have led to different outcomes (i.e., exponential, power law or a combination of both), particularly due to the effect of sampling biases.Specifically, the cut-off point used to account for truncation is highly subjective and often assumes a power law scaling distribution a priori (Figure S6 in Supporting Information S1).Historically, geoscientists perform a log-log transform of ω and N(>ω) (Equation 4in Methods; see Figure 2 for log-log transform) and either fit by eye or use a least squares estimation of the resulting straight line after truncating their data (Pickering et al., 1995).Such methods are not statistically validated and can produce inaccurate estimates (Bonnet et al., 2001;Clark & Cox, 1996;Clauset et al., 2009).We suggest that the inconsistency of such procedures have led to different scaling conclusions.For example, analog modeling from Ackermann et al. (2001) found that size distributions changed from powerlaw to exponential.Similarly, observational data from Gupta and Scholz (2000) found that size distributions are power law for low strain, and exponential for high strain settings.In contrast, for fault distributions that appear similarly curved, Cowie et al. (1995), Bonnet et al. (2001), and Soliva and Schultz (2008) describe a transition from widespread distributed faulting to localized faulting as an initially exponential distribution that evolves into a power law scaling distribution.
Overall, the characterization of fault strain distributions has provided insight on how the fault population evolves with time, the characteristics of which are not easily conveyed by conventional displacement-length plots.Given changes in scaling exponent are regarded as precursors to large earthquakes (Smith, 1981) and, for volcanic areas, precursors to eruptions (Gresta & Patanè, 1983), we suggest that future work should be cross-disciplinary and combine our understanding of temporal variations in earthquake distributions with longer-term variations of fault distributions.We suggest that comprehensive, statistically robust investigations on fault scaling distributions which incorporate sampling biases across multiple scales may enable a better understanding of fault evolution and its implication to related geohazard.

Off-Fault Deformation
Our results demonstrate that OFD accounts for at least 25%-45% of total crustal strain, depending on the investigated rift parameters (Figure 11).In the models, OFD may refer to background stretching (i.e., bulk thinning of the crust), fault drag, fault block rotation, or the abandonment of old, inactive faults.These results are consistent with the presence of so-called "hidden" deformation, initially proposed by Kautz and Sclater (1988), where faults accommodated 40%-50% of the known extension in clay analog models, and 70%-80% in sand analog models.
Our results lie within the range of 40% ± 23% OFD determined for the East Californian Shear Zone (Herbert et al., 2014) and within the range of OFD% calculated using ground-penetrating radar and palaeoseismic trench data, where drag folding and fault block rotations accommodated 46% of extension in the Taupo Rift (McClymont et al., 2010).
At the crustal scale, Marrett and Allmendinger (1992) and Walsh and Watterson (1991) find that as much as 40% of the total extension is accounted for by small-scale faulting in seismically imaged extensional basins of the North Sea.In contrast, Scholz and Cowie (1990) argue that small-scale faulting accounted for less than 10% of the total strain budget, and Cowie et al. (1995) reconcile both arguments by proposing that the relative importance of small faults decreases as the total strain increases due to strain localization.As our results show that a significant proportion of strain is accommodated off-fault, we suggest small-scale faulting (estimated through the extrapolation of fault size distributions) is not sufficient to account for total strain, and is only partially responsible for the discrepancy between summed fault strain and extension predicted by thermal subsidence (e.g., Mardsen et al., 1990).In addition, our results show that OFD% progressively decreases to account for strain localization, however we find that the decrease accounts for <10%, and overall OFD% may still remain relatively high by the end of model time (Figure 11).This suggests that OFD in the crust is inherent to fault evolution during continental extension.OFD may be a useful metric for quantifying rift strain in future studies, although further understanding of its link to fault extraction methods and data set resolution (i.e., seismic data, fieldwork, numerical and analog models) would be required.

Conclusions
1. We demonstrate that high-resolution 3D numerical models of continental extension statistically produce geologically realistic faulting patterns.2. Numerical models show that fault patterns are rapidly established from the onset of extension (within <100 ka) and thus abide by the "constant-length" fault growth model.3. We find that distribution of initial strain and rate of fault weakening exert the strongest control on fault pattern variability-this highlights the significant role of pre-existing crustal heterogeneities in controlling rift geometry.4. The scaling distribution of strain is not power law across all scales; instead all modeled fault populations are subject to an upper bound characteristic scale.5.The amount of strain accommodated by faults is quantified, and we find that diffuse, OFD accounts for 25%-45% of total crustal strain, the extent of which depends on rift parameters.6.A robust statistical analysis of fault scaling distributions with consideration of sampling biases across scales is needed to fully deduce fault scaling laws.
7. The characterization of strain distributions combined with a better understanding of OFD% may enable the future inversion of rift controls through observations alone.

Figure 1 .
Figure 1.Examples of naturally observable extensional systems, illustrating the different structural styles of rift basins.

Figure 2 .
Figure 2. Cumulative frequency of a fault population distribution, fitted by power, gamma and exponential scaling laws.Points in dark gray on the left may be interpreted as sampling biases such as truncation and censorship (see right), therefore only the central portion of the slope is typically fitted to a power law distribution.

Figure 3 .
Figure 3. Five models of continental extension, where Model A is the reference model, and we investigate the effects of a lower extension rate (B) and various rheology parameters in the Model (C, D, E).For the models investigating rheology, their schematic weakening mechanisms are indicated on the right (Models B and E follow the same mechanism as reference Model A), and conceptual map view models of block distribution are indicated below.See Section 2 for details and Supporting Information S1 for the full details of modeling parameters used.

Figure 4 .
Figure 4. Active deformation (left) and accumulated brittle plastic deformation (right) at 5 Myrs rifting for Reference Model A (a and b) and Model C (c and d) with increased fault weakening.

Figure 5 .
Figure 5. Strain extraction and summation where (a) the plastic strain field, located 5 km beneath the initial model surface; (b) the strain rate second invariant, documenting the active rate of deformation at these shallow crustal depths; (c) discrete fault segments are extracted assigned a unique, accessible index; (d) labels masked over the cumulative plastic strain field equate to strain accommodated by faults; (e) fault lengths; (f) the geometric moment, which averages the plastic strain along its fault length.

Figure 6 .
Figure6.Curve fitting the powerlaw, exponential and q-gamma scaling distributions to the fault network of Modal A at 5 Myrs extension.

Figure 7 .
Figure 7. Reference Model A evolution statistics, showing (a) the cumulative scaling distribution of strain per fault at 0.1, 0.5, 1, 2, 3, 4, and 5 Myrs extension, and (b) the summation of fault accommodated strain, and the summation of total strain in the crust.Strain-time values at 0.1, 0.5, 1, 2, 3, 4, and 5 Myrs extension are colored corresponding to timesteps highlighted in panel (a).The difference between the two plotted trendlines is equivalent to the off-fault deformation.Note that the first 30 datapoints in panel (a) are discarded for fitting, the distribution of which is shown by the dashed gray line.

Figure 8 .
Figure 8. Models investigating extension rates of (a) 1.25 mm/yr and (b) 5 mm/yr models, respectively.The upper panels show the active deformation field (log scale) at 4 Myrs.The bottom panels show the evolution of fault strain scaling distributionsat 1, 2, 3, 4, and 5Myrs.The first 30 points have been discarded (marked in gray) for curve fitting.Scaling parameters are indicated in the legend.
Figure10shows the total fault accommodated strain through time across all models, and a snapshot of their fault strain distribution at 5 Myrs.For the cumulative distributions in Figure10a, the geometric moment observed from the NW Shelf of Australia(Meyer et al., 2002; Pan, Bell, et al., 2022)  are shown.Fault strain fromPan,  Bell, et al. (2022)  is summed and divided by the studied area size (1,200 km 2 ) for a dimensionless comparison of strain.Similarly fromBell et al. (2011), 2D cross-sectional strain across three transects across the Corinth Rift are divided by the rift width.Due to a range of uncertainty on the strain summation and age of rifting, we plot upper and lower bound estimates (see Supporting Information S1 for values).

Figure 9 .
Figure 9. Models investigating how rheological parameters effect fault patterns.The upper panel shows active deformation at 2.5 Myrs, and faults are colored by geometric moment.The bottom panel shows the corresponding scaling distribution of strain.We compare the reference model (a) where initial blocks are 2.5 km 2 blocks, the friction and cohesion angle weaken by 2X, and contrast between initial blocks is between 0.5 and 1.5.(b) The subsequent models vary one parameter where block contrast is 0.5-0.6 thus faults weaken at a greater rate, (c) the initial wavelength of faults is over 5 km 2 , and (d) the friction and cohesion angle weaken by 4X.

Figure 10 .
Figure 10.Comparison of strain between models and observations, as (a) Strain (geometric moment) scaling distribution at 5 Myrs; and (b) Total strain accumulation accommodated extracted faults versus time.Known observational geometric moment distributions are plotted for the Exmouth Plateau(Pan, Bell, et al., 2022)  and the Timor Sea(Meyer et al., 2002).Total strain accumulation is summed either across transects in the Corinth Rift(Bell et al., 2011) or in 3D(Pan, Bell, et al., 2022).Due to limitations in data and kinematic constraints, a range marked from upper and lower estimates are shown.

Figure 11 .
Figure 11.Percentage of strain that is accommodated by off-fault (i.e., non-extracted) deformation.The difference between the summed crustal strain and summed fault strain is calculated for each timestep.See Figure S2 in Supporting Information S1 for the off-fault deformation of the reference Model A. Each point represents one timestep.

Figure 12 .
Figure 12.Schematic evolution of fault distribution and network evolution.