The following article is Open access

A Statistical Search for a Uniform Trigger Threshold in Solar Flares from Individual Active Regions

, , and

Published 2023 May 9 © 2023. The Author(s). Published by the American Astronomical Society.
, , Citation Julian B. Carlin et al 2023 ApJ 948 76 DOI 10.3847/1538-4357/acc387

Download Article PDF
DownloadArticle ePub

You need an eReader or compatible software to experience the benefits of the ePub3 file format.

This article is corrected by 2023 ApJ 953 120

0004-637X/948/2/76

Abstract

Solar flares result from the sudden release of energy deposited by subphotospheric motions into the magnetic field of the corona. The deposited energy accumulates secularly between events. One may interpret the observed event statistics as resulting from a state-dependent Poisson process in which the instantaneous flare rate is a function of the stress in the system and a flare becomes certain as the stress approaches a threshold set by the microphysics of the flare trigger. If the system is driven fast, and if the threshold is static and uniform globally, a cross-correlation is predicted between the size of a flare and the forward waiting time to the next flare. This cross-correlation is broadly absent from the Geostationary Operational Environmental Satellite (GOES) soft X-ray flare database. One also predicts higher cross-correlations in active regions where the shapes of the waiting time and size distributions match. Again, there is no evidence for such an association in the GOES data. The data imply at least one of the following: (i) the threshold at which a flare is triggered varies in time; (ii) the rate at which energy is driven into active regions varies in time; (iii) historical flare catalogs are incomplete; or (iv) the description of solar flares as resulting from a buildup and release of energy, once a threshold is reached, is incomplete.

Export citation and abstract BibTeX RIS

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Broad consensus exists that the microphysical process triggering individual solar flares is magnetic reconnection in the corona. A reconnection event becomes more likely to occur the more magnetic energy accumulates in the time between flares (Priest & Forbes 2002; Benz 2016). This phenomenological class of stress–relax model was popularized by Rosner & Vaiana (1978) and expounded by Wheatland & Glukhov (1998), Wheatland (2000a, 2008), Kanazir & Wheatland (2010), and Hudson (2019), among others. A complementary phenomenological description is the avalanche or self-organized criticality model (Lu & Hamilton 1991; Lu et al. 1993; Lu 1995a), inspired by the canonical sandpiles of Bak et al. (1987, 1988). These two classes of model are broadly compatible, but they make different predictions about some long-term statistical observables (Lu 1995b; Boffetta et al. 1999; Wheatland 2000a; Lippiello & Arcangelis 2010; Farhang et al. 2019; Aschwanden & Johnson 2021).

Upon aggregating historical data sets, the probability density function (PDF) of the energy released in solar flares is found to be a power law or log-normal over multiple decades (Rosner & Vaiana 1978; Lu & Hamilton 1991; Verbeeck et al. 2019). The PDF of time intervals between successive flares in the same active region, henceforth termed waiting times, is less clear-cut. Wheatland (2000b) found evidence that the power-law-like shape of the tail of the waiting-time PDF found by Boffetta et al. (1999) is explained by a sum of exponentials, with individual rates themselves drawn from an exponential PDF. This interpretation is further developed by Aschwanden et al. (2021). Flaring rates that vary in time are also noted by Lepreti et al. (2001) and Gorobets & Messerotti (2012). Cross-correlations between the size of a flare and the subsequent (preceding) waiting time, henceforth termed forward (backward) cross-correlations, are a key differentiator between the stress–relax and avalanche descriptions: the latter predicts no size–waiting-time cross-correlations (Jensen 1998). Forward and backward correlations are broadly absent from solar flares data sets (Biesecker 1994; Crosby et al. 1998; Hudson et al. 1998; Wheatland 2000a; Hudson 2020), with the exception of strong forward cross-correlations found in two active regions (Hudson 2019).

Rotational glitches in rotation-powered pulsars (Lyne & Graham-Smith 2012; Haskell & Melatos 2015) are an astrophysical phenomenon analogous to solar flares, in the limited sense that they are consistent with a stress–relax process even though they do not involve magnetic reconnection as far as is currently known (Aschwanden et al. 2018). While the exact process that triggers a glitch is unknown, most models are encompassed by the fundamental idea that "stress" (possibly differential rotation or elastic deformation) builds up secularly between glitches and is released sporadically and partially at a glitch. The instantaneous glitch rate is assumed to grow with the stress in the system. This idea is formalized phenomenologically in the state-dependent Poisson (SDP) process popularized by Fulgenzi et al. (2017). Precise, falsifiable predictions about size and waiting-time PDFs, auto-, and cross-correlations, as well as comparisons to current data sets, show the power and flexibility of the SDP model in the neutron star context (Melatos et al. 2018; Carlin & Melatos 2019a, 2019b; Melatos & Drummond 2019). The same falsifiable predictions also deliver new physical insights when applied to solar flare data, as we show in this paper.

Our goal in this paper is to search the Geostationary Operational Environmental Satellite (GOES) soft X-ray flare database for signatures of a threshold-driven stress–relax process. We do this by deaggregating the data from different active regions and studying summary statistics of flare waiting times and sizes. In Section 2, we outline the SDP framework and how it maps to solar flares. In Section 3, we explore various regimes of the SDP process. We also specify precise, falsifiable tests for the question of whether solar flares are triggered once the energy reaches a static threshold, if it obeys a stress–relax process. The GOES soft X-ray data set to which we apply these tests is described in Section 4. In Section 5, we look for associations between matching waiting-time and size PDFs and high size–waiting-time cross-correlations, as predicted for the SDP process. We conclude with a discussion of the microphysical implications of the data analysis in Section 6.

2. State-dependent Poisson Process

2.1. Equation of Motion

The SDP process is a doubly stochastic renewal process that models the "stress" in the system as a function of time, X(t), as

Equation (1)

where X and t are expressed respectively in dimensionless units of Xcr (the critical stress in the system at which a stress-release event becomes certain) and τ (the time taken for the system to accumulate the critical stress Xcr, in the absence of any stress-release events). We attach a physical interpretation to X(t), in the solar flare context, in Section 2.3 and Appendix A. The amount of stress released at the ith event, ΔXi , is a random variable, drawn from a user-specified PDF, $\eta \left[{\rm{\Delta }}{X}_{i}\,| \,X({t}_{i}^{-})\right]$, where $X({t}_{i}^{-})$ is the stress in the system immediately prior to the ith event. In the standard configuration, $\eta \left[{\rm{\Delta }}{X}_{i}\,| \,X({t}_{i}^{-})\right]$ is fixed as a power law, but other options exist (Carlin & Melatos 2019a, 2021). Making η conditional on $X({t}_{i}^{-})$ is necessary in order to ensure that the stress remains positive-definite and is plausible physically. The second random variable in Equation (1) is N(t), a stochastic function that counts the number of events up to time t. It is determined iteratively via the waiting time between each event.

We assume the instantaneous event rate is a monotonically increasing function of the stress in the system, viz.

Equation (2)

where α = λ0 τ is a dimensionless control parameter and λ0 = λ(X = 1/2)/2 is a reference rate. The long-term statistical output of the SDP process does not depend strongly on the functional form of λ[X(t)], so long as it diverges in the limit X → 1 as in Equation (2) (Fulgenzi et al. 2017; Carlin & Melatos 2019a). Equation (2) implies that the probability of an event occurring approaches one as the stress approaches the critical stress. As the stress increases deterministically between events, the PDF of waiting times Δt following the ith event is that of a variable-rate Poisson process (Cox 1955; Fulgenzi et al. 2017),

Equation (3)

where $X({t}_{i}^{+})$ is the stress immediately following the event at time ti .

2.2. Monte Carlo Automaton

Analytically solving the coupled Equations (1)–(3) to calculate the PDFs of event waiting times or sizes is usually not feasible, except for particular choices of $\eta \left[{\rm{\Delta }}{X}_{i}\,| \,X({t}_{i}^{-})\right];$ see Section 6 of Fulgenzi et al. (2017) for an example. Instead, it is simple to run the following automaton to generate numerical solutions:

  • 1.  
    Pick Δt from Equation (3), given the current stress X.
  • 2.  
    Update the stress to X + Δt to account for the deterministic evolution.
  • 3.  
    Pick ΔX from $\eta \left[{\rm{\Delta }}X\,| \,X+{\rm{\Delta }}t\right]$, and subtract it from the stress.
  • 4.  
    Repeat from step 1.

Given α and $\eta \left[{\rm{\Delta }}{X}_{i}\,| \,X({t}_{i}^{-})\right]$ the automaton generates a time-ordered sequence of waiting times and sizes. From the sequence, we can calculate the long-term PDFs for waiting times and sizes (Fulgenzi et al. 2017; Carlin & Melatos 2019a), as well as the cross-correlation (Melatos et al. 2018; Carlin & Melatos 2019a), autocorrelations (Carlin & Melatos 2019b), and other observables.

2.3. Mapping to Solar Flares

We identify the stress that accumulates between flares and relaxes at a flare with the spatially averaged magnetic energy density in a given active region. One could equally choose a different physical quantity, such as the magnetic shear, depending on the particular microphysics of the flare trigger. We assume that active regions have independent stress reservoirs, i.e., the coronal magnetic fields of different active regions do not interact strongly, and all flares from the same active region extract energy from one reservoir. An idealized toy model relating the foregoing definition of stress to magnetic energy input from the photosphere is outlined in Appendix A, as a simplified but concrete illustration of the physical picture under consideration.

Two major simplifying assumptions are that the rate at which energy is fed into the reservoir, τ−1, is constant in time, and so is the critical, spatially averaged magnetic energy density, Xcr, for a given active region. These assumptions are motivated in Appendix A, but a key goal of the paper is to test their veracity. We also assume that flares reduce X(t) instantaneously. This assumption is defensible, as the typical time between flares (typically hours) is much larger than the typical duration of flaring events (typically minutes) (Fletcher et al. 2011). In what follows, we do not prescribe a particular functional form for $\eta \left[{\rm{\Delta }}{X}_{i}\,| \,X({t}_{i}^{-})\right]$, to keep the SDP process as flexible as possible.

The SDP process generates sequences of dimensionless waiting times $\left\{{\rm{\Delta }}{t}_{i}^{\mathrm{SDP}}\right\}$ and sizes $\left\{{\rm{\Delta }}{X}_{i}^{\mathrm{SDP}}\right\}$. It is natural to ask how they correspond to directly observable sequences of waiting times $\left\{{\rm{\Delta }}{t}_{i}^{\mathrm{obs}}\right\}$ and sizes $\left\{{\rm{\Delta }}{s}_{i}\right\}$. The waiting times satisfy

Equation (4)

where τ is unknown a priori but can be estimated as discussed in Appendix B. The sizes are related in a more complicated way, because some energy is released through channels other than the soft X-ray flux. We make the simplifying assumption that the peak soft X-ray flux multiplied by the duration of the flare (henceforth Δsi for the ith flare in a given active region) is proportional to the stress released from the reservoir. This implies

Equation (5)

The unknown constant of proportionality in Equation (5) hampers direct parameter estimation of Xcr from an observed size PDF.

Full parameter estimation for the SDP process given a set of observed waiting times and sizes is discussed in Melatos & Drummond (2019). It lies outside the scope of this paper. For completeness, in Appendix B we also describe an alternative parameter estimation procedure using a hierarchical Bayesian scheme. The advantage of the hierarchical scheme is that it clarifies, as a matter of principle, the information content and flow in the estimation problem, i.e., the parameter combinations that can be estimated uniquely, and the data components that inform each parameter estimate. The disadvantage is that it can be implemented in practice only if $\eta \left[{\rm{\Delta }}{X}_{i}\,| \,X({t}_{i}^{-})\right]$ is drawn from a certain class of mathematical functions. The favored class produces waiting-time and size PDFs that do not match solar flare observations (see Section 5), so the hierarchical scheme is not applied to real data in this paper. Nonetheless, it is included for the benefit of the reader, who may wish to develop it further for solar flare analysis in the future.

3. Observable Signatures of a Rapidly Driven Process

Broadly speaking, an SDP process operates in one of two regimes: slowly driven, with α ≫ 1, and rapidly driven, with α ≪ 1. In this section, we identify the key dynamics in both regimes, in Sections 3.1 and 3.2 respectively. We then infer, in Section 3.3, a set of observable signatures that, if absent, rule out the operation of a rapidly driven process of the form described in Section 3.2.

3.1. Slow Driver

When the system is slowly driven, we have X(t) ≪ 1, as events are usually triggered before the stress accumulates to near the threshold. When $\eta \left[{\rm{\Delta }}{X}_{i}\,| \,X({t}_{i}^{-})\right]$ is a power law, the predicted waiting-time PDF pt) is an exponential, and the predicted size PDF pX) is a power law over multiple decades. We show the qualitative behavior of the stress in the slowly driven regime in the top left panel of Figure 1, where for the sequence of 20 stress-release events shown, we have 0.1 < X(t) < 0.6. In the bottom left panels of the same figure, we show the predicted waiting-time and size PDFs. The size PDF pX) has a turnover in logarithmic slope at ΔX ≈ 10−2 due to the particular choice of $\eta \left[{\rm{\Delta }}{X}_{i}\,| \,X({t}_{i}^{-})\right]$ (see caption for the functional form).

Figure 1.

Figure 1. Top panels: qualitative behavior of the evolution of the stress, X(t), in the SDP process in the two main regimes, i.e., slowly driven (left panel, α = 10) and rapidly driven (right panel, α = 0.1). Bottom panels: predicted waiting-time and size PDFs pt) and pX), respectively, in the above regimes. For all panels, we fix $\eta \left[{\rm{\Delta }}{X}_{i}\,| \,X({t}_{i}^{-})\right]\propto {\left({\rm{\Delta }}{X}_{i}\right)}^{-3/2}H\left[{\rm{\Delta }}{X}_{i}-{10}^{-2}X({t}_{i}^{-})\right]$, where the Heaviside function H(...) enforces a minimum stress-release size of 1% of the stress in the system, ensuring integrability. The histograms in the bottom panels each include N = 107 events.

Standard image High-resolution image

Depending on the choice of $\eta \left[{\rm{\Delta }}{X}_{i}\,| \,X({t}_{i}^{-})\right]$, there may be detectable backward cross-correlations (i.e., a correlation between the size of an event and the waiting time since the preceding event). This is because the size of an event cannot exceed the amount of stress in the system, so longer periods of stress accumulation allow for the possibility of larger events. This backward cross-correlation is especially pronounced with the choice $\eta \left[{\rm{\Delta }}{X}_{i}\,| \,X({t}_{i}^{-})\right]\propto \delta \left[{\rm{\Delta }}X-X({t}_{i}^{-})\right]$, where δ(...) is the Dirac-delta function. This choice forces the stress reservoir to empty at each event, and it collapses the SDP process to other stochastic processes in the literature, e.g., forest fire models (Daly & Porporato 2006). In the solar flare context, a backward cross-correlation is a long-standing prediction of the Rosner & Vaiana (1978) stress buildup model, as well as the "reset" model of Hudson et al. (1998), Hudson (2019). If instead $\eta \left[{\rm{\Delta }}{X}_{i}\,| \,X({t}_{i}^{-})\right]$ prefers small stress-release events, e.g., if it is a power law (Fulgenzi et al. 2017), the backward cross-correlation is small.

3.2. Fast Driver

When the system is rapidly driven, we have $X({t}_{i}^{-})\lesssim 1$, i.e., the stress is driven close to the threshold before each event. As the dynamics of the system are strongly influenced by the presence of the threshold, we sometimes call this regime "threshold-driven." A prediction of the SDP process in this regime is that the size and waiting-time PDFs should "match," i.e., they should be the same distribution, up to a linear scaling (Fulgenzi et al. 2017). For example, if $\eta \left[{\rm{\Delta }}{X}_{i}\,| \,X({t}_{i}^{-})\right]$ is a power law, both the waiting-time and size PDFs are power laws. We show this in the bottom right panels of Figure 1. We also show the qualitative behavior of the stress versus time in this regime in the top right panel of the same figure, where we see X(t) → 1 (i.e., unit critical stress) before every stress-release event.

A strong forward cross-correlation (i.e., a correlation between the size of an event and the waiting time to the next event) is observed in this regime. This is because a large stress-release event results in a longer delay before the system accumulates enough stress to approach the threshold. In the solar flare context, this cross-correlation is predicted by the "saturation" model of Hudson et al. (1998) and Hudson (2019). The smooth transition between the fast-driven regime (α ≪ 1), with high forward cross-correlations, and the slow-driven regime (α ≫ 1), is shown in Figure 2.

Figure 2.

Figure 2. Forward, ρ+ (solid curve), and backward, ρ (dashed curve), cross-correlations between event sizes and waiting times for 102 values of 10−2α ≤ 102, with N = 105 events generated per value of α. We fix $\eta \left[{\rm{\Delta }}{X}_{i}\,| \,X({t}_{i}^{-})\right]$ to the same functional form as in Figure 1. A version of this figure first appeared as Figure 13 in Fulgenzi et al. (2017).

Standard image High-resolution image

3.3. Heuristic Test for a Threshold-driven Process

Ideally, we fit the parameters of the SDP process (τ, Xcr, λ0, and $\eta \left[{\rm{\Delta }}{X}_{i}\,| \,X({t}_{i}^{-})\right]$) to a paired sequence of solar flare sizes and waiting times, and then infer what regime applies. In practice, however, at least some of these parameters vary between active regions, which limits us to sequences of length N ≲ 50 (typical active regions have at most dozens of events; see Section 4), which are insufficient to infer four or more parameters. Instead, we combine the results in Sections 3.1 and 3.2 to deduce qualitative features, which must be present in multiple observables simultaneously (e.g., PDFs and cross-correlations), if the SDP process operates in a particular α regime—or indeed operates at all. Specifically, we summarize the behavior in Sections 3.1 and 3.2 into the following observationally testable prediction. If solar flares are well-modeled with a rapidly driven process (e.g., the SDP process with α ≪ 1), then we should see large forward cross-correlations, accompanied by waiting-time and size PDFs with the same shape in individual active regions. This coincident signature should become more prominent as α (or a proxy thereof) decreases. We quantify this prediction in Section 5.2.

With the test above, we also ameliorate the impact of a potentially mis-specified model. Although it is agnostic about the microphysics, the SDP process is not the only possible prescription of stress accumulation and release. If some of the assumptions outlined in Section 2.3 do not hold, a different underlying model may underpin solar flares. For example, one may build a model in which stress accumulates according to a Brownian random walk with some underlying drift, until a threshold is breached, at which point a stress-release event is triggered (Carlin & Melatos 2020). In the Brownian stress-accumulation model, X(t) is always driven to the threshold before each event. If drift occurs faster than diffusion, we would predict the same coincident signature as for the SDP process: large forward cross-correlations are accompanied by matching waiting-time and size PDFs.

4. GOES Soft X-Ray Observations

The X-ray Sensor (XRS) on the Geostationary Operational Environmental Satellites (GOES) has continuously monitored the Sun in soft X-rays (0.5–8 Å) since 1975. In Section 4.1, we introduce the flare summary data analyzed in this paper. In Section 4.2, we remind the reader of the obscuration effect described in Wheatland (2001), explain how it affects our analysis, and briefly touch on other biases the GOES flare detection algorithm may have on the completeness of the catalog as a whole. In Section 4.3, we analyze aggregated waiting-time and size PDFs across all active regions.

4.1. Flare Data

We use the publicly available flare summary data hosted by the National Geophysical Data Center (NGDC) 4 for flares before 2015 June 29, and by the Space Weather Prediction Center (SWPC) 5 for flares after 2015 June 28. We combine these two data sources into a homogeneous, cleaned database (henceforth "catalog"), with some anomalies corrected as described in Appendix C. These data are collated with the flare start epochs, ts, peak epochs, tp, and end epochs, te. The peak epoch is defined as the epoch that contains the highest peak flux, after the start epoch. The end epoch is defined as the epoch at which the flux returns to half of the difference between the peak flux and the background flux. The peak flux (irradiance) of each flare, fp, is calculated with reference to the recorded flare class, in units of W m−2. The longitude and latitude of the flare are also often available, if the flare is associated with an active region. For clarity, we henceforth denote as xi,k an arbitrary measurement of the variable x in the ith flare from the kth active region.

We define the waiting time between two flare epochs as ${\rm{\Delta }}{t}_{i,k}={t}_{i+1,k}^{{\rm{p}}}-{t}_{i,k}^{{\rm{p}}}$. One could equally use the flare start or end epochs to define the waiting time. We use the flare peak epoch because it is less influenced by the background level, as discussed further in Section 4.2. We define the flare size as ${\rm{\Delta }}{s}_{i,k}={f}_{i,k}^{{\rm{p}}}\left({t}_{i,k}^{{\rm{e}}}-{t}_{i,k}^{{\rm{s}}}\right);$ that is, we multiply the irradiance by the duration of the flare to obtain a flare size with dimensions of energy per unit area. The duration of an active region is ${\rm{\Delta }}{T}_{k}={t}_{{N}_{k},k}^{{\rm{e}}}-{t}_{1,k}^{{\rm{s}}}$, where there are Nk flares in the region. The average flare rate in an active region is λk = Nk Tk . The average λk systematically overestimates the true flare rate, as the duration ΔTk is between two flare epochs, both of which are counted in Nk . An unbiased estimate would take ΔTk as the difference between the epochs when the active region appears and disappears, but neither epoch is recorded in the GOES flare summary data. 6 A concern we touch on further in Section 4.3 is that λk = Nk Tk assumes that the flare catalog is complete, i.e., all flares from a given region are detected and correctly attributed to that region.

As of 2022 September 1, there are 83,283 flares in the catalog, 49,328 of which are associated with an active region. There are 2429 active regions with Nk ≥ 5, accounting for 42982 flares. The median number of flares in an active region (when considering only regions with Nk ≥ 5) is 12, but the average is 17.7, as the distribution is peaked at Nk =5 and monotonically decreases with Nk . We show the probability mass function of the number of flares in an active region, Pr(Nk ), in the top panel of Figure 3 for regions with Nk ≥ 5. The PDF of observed flare rates, p(λk ), across all active regions with Nk ≥ 5, is displayed in the bottom panel of the same figure. The distribution of λk is well-described by a log-normal, with mean 0.08 hr−1, and standard deviation 0.7 hr−1. Henceforth, we typically only include regions with Nk ≥ 5 in our analysis, unless stated otherwise, as many of the subsequent statistical tests, such as the cross-correlation(s), have low statistical power with smaller sample sizes.

Figure 3.

Figure 3. Top panel: Probability mass function of the number of flares per active region, Nk , for all active regions with Nk ≥ 5. Regions with 0 ≤ Nk ≤ 4 are excluded, as the number of samples per region is too small for the statistical tests in this paper. Bottom panel: PDF of the flare rate, λk , for all active regions with Nk ≥ 5, binned into 50 uniformly spaced bins between 0 hr−1 and 0.5 hr−1.

Standard image High-resolution image

4.2. Obscuration and Flare Size Bias of Subsequent Flares after a Large Flare

As described in Section 2.2 of Wheatland (2001), the GOES flare detection algorithm involves a selection effect that obscures the detection of flares following a large flare, due to the enhanced background soft X-ray flux. To detect a flare, the flux must monotonically increase for four consecutive minutes, with the last value being 1.4 times the value from three minutes earlier. Hence, a flare must produce a 40% increase above the background flux. However, while flares typically rise rapidly to peak flux, the soft X-ray emission is observed to decay on a timescale of hours (Benz 2016). Thus, even large flares may be obscured by the enhanced background flux following, say, an X1 (fp = 10−4 Wm−2) class flare.

The above detection algorithm introduces a secondary bias in the catalog. A flare, say with recorded peak flux ${f}_{i}^{{\rm{p}}}$, that occurs during the decay of another flare, say with recorded peak flux ${f}_{i-1}^{{\rm{p}}}$, has an enhanced peak flux compared to the case of the same flare occurring during a period of low background flux, viz.

Equation (6)

where τ is the timescale on which the peak flux from the (i − 1)th flare decays, and ${f}_{i,{\rm{true}}}^{{\rm{p}}}$ is the "true" peak flux of the ith flare. A secondary effect is that the duration of the ith flare will have a reduced duration, due to the decaying contribution of the previous flare to the background flux. That is, during the ith flare, the background flux decays by a factor of $\exp \left[-\left({t}_{i}^{{\rm{e}}}-{t}_{i}^{{\rm{s}}}\right)/\tau \right]$, which reduces the time taken for the flux to decay back to half the difference between the peak and pre-flare fluxes. These two effects run counter to one another, as Δsi is defined as the product of peak flux and duration. They are likely present in a small subset of the catalog, as only 1.2% of all flares from regions with Nk ≥ 5 have start times before the end time of the previous flare.

A comprehensive reanalysis of the soft X-ray flux time series, with appropriate background subtraction, may reveal flares that were not detected with the original flare detection algorithm and correct the biases in flare size outlined above. Such a reanalysis lies outside the scope of this paper. Acknowledging that the GOES catalog is incomplete, we mitigate the impact on our analysis by creating a secondary masked catalog that only includes flares with Δs ≥ 10−3 Jm−2 (i.e., class C1 multiplied by the median duration of 103 s and higher), where we know the fraction of missed flares is smaller. This masked catalog is used in Sections 4.3 and 5.1 when we perform comparative studies and parametric fits for the waiting-time and size PDFs.

4.3. Aggregate Size and Waiting-time Statistics

When we aggregate flare waiting times and sizes across all active regions with Nk ≥ 5, we obtain the PDFs shown in the top and bottom panels of Figure 4, respectively. The data are shown as the black histograms. The gray region in the bottom panel shows the flare sizes that are not included in the masked catalog, as described in Section 4.2. The difference between the masked and full catalog is not visible in the top panel, as the effect on pt) is minimal.

Figure 4.

Figure 4. Top panel: PDF pt) of waiting times (thick black stepped curve), Δt, aggregated over all active regions with Nk ≥ 5, binned into 50 linearly spaced bins between the minimum and maximum Δt in the full, unmasked catalog. Overlaid are the best-fit estimates of an exponential (blue dashed curve), power-law (orange dotted curve), and log-normal (green dotted–dashed curve) PDF. Bottom panel: As for top panel, but for sizes Δs, and the data are binned into 50 logarithmically spaced bins. The gray region in the bottom panel shows the difference between the size PDFs from the masked and full catalogs (see Section 4.2).

Standard image High-resolution image

Three empirical trial distributions with common analytic forms are overlaid on each PDF for the masked catalog. The overlaid distributions have parameters fixed to their maximum likelihood values, given the data. By eye, it is clear that the log-normal distribution best describes the aggregated waiting-time PDF 7 , pt), while a power-law distribution best describes the aggregated size PDF, ps). We formalize this comparison using the corrected Akaike Information Criterion (AICc; Akaike 1974; Hurvich & Tsai 1989). The AICc calculates the model, among a set of possible models, that minimizes the information loss, while accounting for potential bias due to the number of model parameters and the sample size. When calculated for pt), the relative probability for a log-normal describing the data over an exponential or a power law is ${e}^{3\times {10}^{3}}$ or ${e}^{3\times {10}^{4}}$, respectively. When calculated for ps), the relative probability for a power law describing the data over an exponential or a log-normal is ${e}^{3\times {10}^{4}}$ or ${e}^{4\times {10}^{3}}$, respectively. If we instead only mask flares with peak flux less than 10−6 Wm−2 (i.e., class C1), we find that a log-normal best describes the data. This preference is also noted in Verbeeck et al. (2019). It arises because of the small number of flares in this alternatively masked catalog with 10−4 < Δs/Jm−2 ≲ 10−3. If we fit the full, unmasked catalog (i.e., include the flares shaded in gray in the bottom panel of Figure 4), we find that ps) is again fitted best with a log-normal. We remind the reader that ps) is an observed distribution; it is a function of both the underlying generative physics (i.e., how much energy is released in each flare) and the systematic observational biases (i.e., how many and what flares are detected or not). The masking described in Section 4.2 is a first pass at accounting for some of these biases.

If we assume that the same α and $\eta \left[{\rm{\Delta }}{X}_{i}\,| \,X({t}_{i}^{-})\right]$ apply to all regions, Figure 4 is inconsistent with the SDP framework for any choice of α and $\eta \left[{\rm{\Delta }}{X}_{i}\,| \,X({t}_{i}^{-})\right]$. However, as we discuss in Section 2.3, there is good reason to believe that α (and perhaps even $\eta \left[{\rm{\Delta }}{X}_{i}\,| \,X({t}_{i}^{-})\right]$) may vary from region to region.

5. Disaggregated Data in Individual Active Regions

The goal of this section is to search for signatures of a threshold-driven SDP process in individual active regions, rather than considering all flares in aggregate. In Section 5.1, we consider only waiting-time and size PDFs, without regard to their potential cross-correlation. In Section 5.2, we calculate these cross-correlations, and in Section 5.3, we apply the test outlined in Section 3.3 by searching for an association between matching waiting-time and size PDFs and the cross-correlation, in individual active regions. In Section 5.4, we perform a preliminary investigation of the longer-term memory in the system by calculating the autocorrelation between subsequent waiting times and subsequent sizes. The analysis in Section 5.1 involves parametric fitting of the size PDFs, so we use the masked catalog described in Section 4.2. However, in Sections 5.25.4, we use the full, unmasked catalog, as the tests performed in these sections are nonparametric.

5.1. Waiting-time and Size PDFs versus Flare Rate

For regions with Nk ≥ 5, we disaggregate the data and ask what PDF shape best characterizes each region's waiting-time and size PDFs, rather than considering only the aggregated data set as in Section 4.3. Using the AICc, we find that, for waiting-time PDFs, 63% of regions are fitted best with an exponential distribution, 20% with a log-normal, and the remainder with a power law. For size PDFs, 63% of regions are fitted best with a power-law distribution, 24% with an exponential, and the remainder with a log-normal. These proportions are broadly consistent with previously published results, i.e., while the aggregated pt) is fitted best with a log-normal, individual regions are often fitted best with an exponential (with varying rates) (Wheatland 2000b). Individual regions have ps) that is usually fitted best with a power law, but can occasionally be better represented with log-normal or exponential distributions, especially in regions with lower Nk . As an example of the different shapes that distributions of waiting times and sizes can have in different regions, we display four arbitrary but representative per-active-region fits in Appendix E.

One may reasonably ask whether there are clear trends, or predictors, for the waiting-time and size PDFs in any given region. An exhaustive search for predictors, e.g., with a multiple regression analysis, lies outside the scope of this paper, but one sensible first step is to see if the shape that fits best evolves with λk . This test is performed for the waiting-time and size PDFs in the top and bottom panels of Figure 5, respectively. This figure is constructed by binning all active regions with Nk ≥ 5 into 20 uniformly spaced bins between λk = 0.025 hr−1 and λk = 0.3 hr−1, then calculating which distribution fits best the waiting times and sizes using the AICc. For the waiting-time PDFs, the proportion of regions fitted best with a power law stays roughly constant as λk increases, at around 15%, while the proportion of regions fitted best with a log-normal grows with λk , at the expense of the exponential. For the size PDFs, we see a similar trend, with the proportion of regions fitted best with a power law staying roughly constant as λk increases, this time at around 65%, while the proportion of regions fitted best with a log-normal grows with λk , at the expense of the exponential. In both panels, we plot the cumulative distribution function (CDF) of λk to remind the reader that each λk bin does not contain the same number of regions; 90% of regions with Nk ≥ 5 have 0.028 < λk /(1 hr−1) < 0.27.

Figure 5.

Figure 5. Top panel: Proportion of active regions with waiting-time PDF fitted best by three possible shapes (a power law, an exponential, and a log-normal) as determined by the AICc, as a function of the region's flare rate, λk . The CDF of λk is overlaid in black. Bottom panel: As in the top panel, but for size PDFs.

Standard image High-resolution image

Can the evolution of the proportion of each shape versus λk be explained with the SDP model? Under the assumption that λk is a tracer of the driving rate, i.e., λk α−1, we expect to see a greater proportion of exponentially distributed waiting times at low λk (high α). This is the regime in which the stress does not approach the threshold at X = Xcr before each event, so waiting times are uncorrelated with sizes and are (broadly) Poissonian, i.e., the waiting times are exponentially distributed. This expectation broadly conforms with what we see in the top panel of Figure 5.

The evolution versus λk in the bottom panel of Figure 5 is harder to explain with the SDP model, if each region has the same $\eta \left[{\rm{\Delta }}{X}_{i}\,| \,X({t}_{i}^{-})\right]$. We expect ps) ∝ η, when α is low, but we see all of the three possible shapes represented at the highest values of λk . This implies at least one of the following: (i) η truly varies from one region to the next, which implies different stress-release mechanisms are at play in different regions; or (ii) λk ≳ 0.2 hr−1 does not correspond to α ≪ 1, and hence ps) ∝ η; or (iii) the small sample size of events in each region results in the AICc not favoring the "true" size distribution; or (iv) the obscuration effects described in Section 4.2 is stronger with higher λk , due to the enhanced background flux in regions that have many flares in a short period of time. To test (iii), we generate Figure 5 again for regions with Nk ≥ 10, instead of Nk ≥ 5. For both the waiting times and sizes, the proportion fitted best by an exponential drops by ∼10% in each λk bin, while the proportion fitted best by a log-normal increases. Qualitatively, however, the evolution with λk remains consistent with what is seen in Figure 5.

5.2. Size–Waiting-time Cross-correlations

The correlation between flare sizes and subsequent waiting times, i.e., the "forward" cross-correlation, is denoted as ρ+,k , while the correlation between flare sizes and the preceding waiting times, i.e., the "backward" cross-correlation, is denoted as ρ−,k . We calculate these correlations using the Spearman correlation coefficient (Lehmann & D'Abrera 2006).

The PDFs of forward and backward cross-correlations measured in all active regions with Nk ≥ 5 are displayed in the top panel of Figure 6. The PDFs are broad, mostly because the median Nk is 12 (i.e., small). However, p(ρ+,k ) and p(ρ−,k ) are different; the latter has a median of ρ−,k = 0.0, while the former has a median of ρ+,k = 0.08. A Kolmogorov–Smirnov (KS) two-sample test provides quantitative evidence of the difference, returning a p-value of 10−14.

Figure 6.

Figure 6. Top panel: PDF of forward cross-correlation, ρ+,k (black stepped curve), and backward cross-correlation, ρ−,k (black dashed curve), for all active regions with Nk ≥ 5. Data are binned into 30 uniformly spaced bins between ρ±,k = −1 and ρ±,k = 1. Bottom panel: Scatter plot of the forward cross-correlation, ρ+,k , and flare rate, λk , for all regions with Nk ≥ 10. The dashed gray line corresponds to ρ+,k = 0.

Standard image High-resolution image

The SDP process predicts a high forward cross-correlation for α ≪ 1, as discussed in Section 3.2. The data do not show evidence in favor of such a correlation in the majority of active regions. In the bottom panel of Figure 6, we do not see a clear visual trend between ρ+,k and λk (for regions with Nk ≥ 10), although a Spearman correlation test returns a small but nonzero correlation of 5 × 10−2 (p-value of 0.06). We remind the reader that both ρ+,k and λk are empirical estimates for each active region, and the number of events per region is small. We do not plot uncertainties in the bottom panel of Figure 6, but they are typically large, on the order of the magnitude of the central estimate. These results imply that either (i) all active regions have α ≳ 1, where no forward cross-correlation is expected, i.e., λk ≳ 0.2 hr−1 does not correspond to the threshold-limited regime; or (ii) some active regions have α ≪ 1, but either the trigger threshold or the driving rate is not constant with time, i.e., the random process triggering flares does not conform to the assumptions made in the SDP framework.

As described in Section 4.2, the GOES catalog is not complete, i.e., while a given active region is visible, not all flares that occur are recorded. Wheatland (2000a) noted that the obscuration in Section 4.2 creates an artificial forward cross-correlation, which explains why the median ρ+,k is positive.

5.3. Matching the Shapes of pt) and ps)

Another way to probe these data is to ask whether regions with relatively high ρ+,k exhibit "matching" functional forms for pt) and ps), as the SDP model predicts for α ≲ 1. We quantify the degree to which the PDFs match via the p-value ${{ \mathcal M }}_{k}$ of the KS two-sample test applied to the sampled PDFs, {Δti,k } and {Δsi,k }. To calibrate, we perform a Monte Carlo simulation using sequences of events drawn from the SDP model. For each value of α, we simulate 105 "regions". For each region, we generate M events, where M is a random number drawn from p(Nk ), restricted to values Nk ≥ 10, as empirically measured for the GOES regions (the distribution for Nk ≥ 5 shown in Figure 3). From these M events, we calculate summary statistics, e.g., ${ \mathcal M }$ and ρ+. This results in 105 pairs of $({ \mathcal M },{\rho }_{+})$, from which we construct a two-dimensional kernel density estimate (KDE; Wand & Jones 1995). The KDE estimates the true joint PDF $p({ \mathcal M },{\rho }_{+})$, and it is qualitatively equivalent to a smoothed, two-dimensional histogram.

The result of this procedure for four values of α is shown in the top panel of Figure 7. For α = 0.01 (dark orange lines), the 10% and 50% credible interval contours are invisible in the top right corner of the panel, as the vast majority of simulated regions have ρ+ ≈ 1 and ${ \mathcal M }\approx 1$. For α = 0.1, the KDE spreads out slightly, with the 50% credible interval reaching ρ+ ≈ 0.8 and ${ \mathcal M }\approx 0.9$. For α = 1, the PDF shifts such that the 50% credible interval contour stretches from ${ \mathcal M }=1$ to ${ \mathcal M }\approx 0.8$, while we have 0 ≲ ρ+ ≲ 0.8. For α = 10, we find ρ+ centered around zero, while the 50% credible interval of ${ \mathcal M }$ extends to ${ \mathcal M }\approx 0.3$. The bottom panel of Figure 7 repeats the exercise for the backward cross-correlation. It confirms that ρ and ${ \mathcal M }$ are uncorrelated, as predicted elsewhere (Fulgenzi et al.2017; Carlin & Melatos 2019a, 2019b). Integrating over ρ±, the reader would find that the marginal distribution of ${ \mathcal M }$ is broad in both panels, for α ≳ 1, as it is difficult to confidently reject the null hypothesis that two sets of samples are from the same distribution, when dealing with small sample sizes.

Figure 7.

Figure 7. Calibrating the cross-correlation test in Section 5.3: two-dimensional KDEs of the relationship between ρ+ and ${ \mathcal M }$ (top panel), and between ρ and ${ \mathcal M }$ (bottom panel), for events simulated from the SDP model with four values of α (see legend for color code). The text in Section 5.3 reports details about the Monte Carlo procedure. Contours correspond to the 10%, 50%, and 90% credible intervals.

Standard image High-resolution image

Monte Carlo calibration in hand, we present the equivalent data for all regions in the GOES catalog with Nk ≥ 10 in Figure 8. To compute ${{ \mathcal M }}_{k}$ for each region, we first normalize both the waiting times and the sizes by their respective means for that region, before applying the KS two-sample test. In the top (bottom) panel, we see no clear relationship between ρ+,k (ρ−,k ) and ${{ \mathcal M }}_{k}$. Marginalizing over ${{ \mathcal M }}_{k}$, we recover the PDFs of ρ+,k and ρ−,k , shown in the top panel of Figure 6. If, for the sake of argument, we assume that all regions have the same value of α, we can compare the KDEs in Figure 8 to those in Figure 7. Under this assumption, we are pushed into the regime 1 ≲ α ≲ 10, as the $p({\rho }_{+,k},{{ \mathcal M }}_{k})$ KDE is slightly offset from the horizontal axis (i.e., median ρ+,k > 0), but the 50% credible interval for ${{ \mathcal M }}_{k}$ only extends to ${{ \mathcal M }}_{k}\lesssim 0.6$. Even if α is not exactly the same in each region, we can interpret the above result as ruling out the notion that a large proportion of regions have α ≲ 1, as if that were the case we would see higher values of ${{ \mathcal M }}_{k}$ associated with higher ρ+,k more often than in Figure 8.

Figure 8.

Figure 8. As for Figure 7, but for GOES data, with black crosses marking all regions in the full, unmasked GOES catalog with Nk ≥ 10. Red contours correspond to the 10%, 50%, and 90% credible intervals inferred from the black crosses.

Standard image High-resolution image

5.4. Autocorrelations and Longer-term Memory

While there is a wealth of information available in flare waiting-time and size PDFs, as in Section 5.1, and cross-correlations, as in Sections 5.2 and 5.3, one may also consider statistics that quantify the longer-term memory of the stress in the system. For example, as in the context of neutron star glitches (Carlin & Melatos 2019b), studying the autocorrelation between consecutive waiting times, ρΔt , or between consecutive sizes, ρΔs , allows one to place constraints on applicable SDP model parameters. In the fast-driven regime, α ≪ 1, we predict ρΔt = 0 and ρΔs = 0, as we have X(t) → Xcr before every stress-release event, resulting in the system resetting at every event. If we have $\eta \left[{\rm{\Delta }}{X}_{i}\,| \,X({t}_{i}^{-})\right]\propto \delta [X({t}_{i}^{-})]$, i.e., all stress in the system is released at each event, the same prediction of no autocorrelations holds—again the system is reset at every event. Observations of nonzero autocorrelations therefore rule out certain regimes in the SDP framework.

When we calculate ρΔt,k and ρΔs,k for all regions in the GOES catalog with Nk ≥ 10, we find that the PDFs p(ρΔt,k ) and p(ρΔs,k ) are broad, akin to the p(ρ+,k ) shown in Figure 6. The medians are 0.11 and 0.10, respectively. This result is incongruent with the assumption α ≪ 1 in all regions, as α ≪ 1 implies both PDFs should have a median of zero. These results likely stem from the obscuration effects described in Section 4.2. One tentative physical interpretation of a positive ρΔt and ρΔs is via analogy to terrestrial earthquakes, which exhibit aftershocks, i.e., large events are often followed by larger-than-average events, with smaller-than-average waiting times (Utsu & Ogata 1995).

6. Conclusion

In this paper, we revisit the long-standing question of whether solar flares in different active regions are triggered by a stress–relax process with a trigger threshold that is constant in time. We do so by mapping the flaring process to an SDP process, which operates independently in each active region. The SDP framework has been applied in related contexts to forest fires (Daly & Porporato 2006) and solar flares (Wheatland 2008), before being extended in the context of neutron star glitches (Fulgenzi et al. 2017; Melatos et al. 2018; Carlin & Melatos 2019a, 2019b). If one assumes a constant driving rate, and a constant "stress" threshold at which relaxation events are guaranteed to occur, the model makes precise, falsifiable predictions regarding the statistics of sequences of waiting times and sizes, such as their PDFs and cross-correlations. It is agnostic about the underlying microphysical mechanism.

Analyzing the historical GOES soft X-ray flare catalog, using data from 1975 until 2022, we systematically search all active regions for signatures that flares are consistent with the SDP model with α ≪ 1 (rapidly driven). We find no evidence that this is the case. Specifically, when considering just the waiting-time and size PDFs of each active region, we find that either (i) the conditional PDF of stress-release sizes, $\eta \left[{\rm{\Delta }}{X}_{i}\,| \,X({t}_{i}^{-})\right]$, varies from one region to another, or (ii) a flare rate of λk ≳ 0.2 hr−1 does not correspond to α ≪ 1 (assuming that λk traces α). The analysis takes into account the selection effect that obscures the detection of flares following a large flare due to enhancement of the soft X-ray background.

On the other hand, if many active regions house an SDP process driven rapidly toward a static-in-time threshold before each event, i.e., have α ≪ 1, then those regions that have large cross-correlations between event sizes and subsequent waiting times should also have waiting-time and size PDFs of the same shape. This prediction is not supported by the data, as proved clearly by Figure 8. The match between pt) and ps), quantified by ${{ \mathcal M }}_{k}$ (the p-value from a KS two-sample test) for the kth active region, does not correlate with the forward cross-correlation, ρ+,k . If we assume each region has the same α, this test implies 1 ≲ α ≲ 10.

We emphasize that the prediction that ${{ \mathcal M }}_{k}$ should correlate with ρ+,k for a process driven to a threshold does not rely on the specific details of the SDP framework, such as the relationship between flare rate and stress in Equation (2), nor the functional form of $\eta \left[{\rm{\Delta }}{X}_{i}\,| \,X({t}_{i}^{-})\right]$. Any stochastic process that drives the stress toward a static threshold before each event would predict an equivalent observable—for example, the Brownian stress-accumulation model (Carlin & Melatos 2020).

There are many ways to interpret the results in Sections 4.3 and 5: (i) solar flares may not be triggered by stress (e.g., magnetic energy density) breaching a threshold; a completely different process may trigger a flare. (ii) The GOES flare catalog is incomplete; flares that occur in the aftermath of large flares are not recorded, creating an artificial cross-correlation. (iii) The threshold at which a glitch is triggered and/or (iv) the driving rate at which stress accumulates may vary with time; either of options (iii) and (iv) could wash out the observable signature. Untangling these explanations entails exploiting the entire wealth of data available in the flare catalog(s), rather than focusing on individual special flares or regions.

In closing, we touch on the following question: if option (iii) above is true, is a threshold that varies with time compatible with any plausible microphysical flare triggers? The question falls outside the scope of this paper, which focuses on the microphysics-agnostic analysis in Sections 4 and 5, so we limit ourselves to the following brief remark. Magnetohydrodynamic instabilities relevant to solar flare activity are triggered above a threshold, which typically depends on the detailed geometry of a flaring magnetic loop, not just its bulk properties (e.g., magnetic energy density). The geometry of a flaring loop does vary with time in general. For example, the ideal kink instability occurs when the total twist Φ = lBϕ (r)/[rBz (r)] satisfies Φ > Φcr(a) ∼ 10, where l and r are the length and minor radius of the current-carrying loop, Bϕ (r) and Bz (r) are the toroidal and axial magnetic field components, and a is the loop aspect ratio (Török et al. 2004). As subphotospheric turbulence perturbs randomly the footpoints of a magnetic flux tube, the variables a and hence Φcr(a) fluctuate stochastically. Quantifying the amplitude (drift and diffusion) of the fluctuations is a task for detailed magnetohydrodynamic simulations and lies well outside the scope of this paper, but it is conceivable that the amplitude is sufficiently large to render option (iii) above viable.

We acknowledge helpful discussions with Kai Yang in the early stages of this paper. J.B.C., A.M., and M.S.W. are supported by the Australian Research Council (ARC) Centre of Excellence for Gravitational Wave Discovery (OzGrav) (project number CE170100004) and ARC Discovery Project DP220102201. J.B.C. is supported by an Australian Postgraduate Award.

Software: Numpy (Harris et al. 2020), Scipy (Virtanen et al. 2020), Matplotlib (Hunter 2007), and Stan (Stan Development Team 2022) through the cmdstanpy interface.

Appendix A: Magnetic Energy Density as a Stress Variable in an Active Region

The SDP framework described in Sections 2 and 3 is agnostic about the microphysics of solar flares, beyond the general assumption that the flaring rate increases with the stress in the system and diverges at a stress threshold. In this appendix, we sketch briefly one possible mapping between the SDP model and a specific microphysical flare model, in which the stress variable is the spatially averaged magnetic energy density in an active region. We emphasize that the model is phenomenological and highly idealized. We do not favor it over the many alternatives; it is merely one possible illustration of how such a mapping may work, as a guide to the interested reader.

Let X(t) correspond to the spatially averaged magnetic energy density in an active region. The spatial average is taken in order to package the stress into a single variable, noting that the magnetic energy density is nonuniform in reality. Let the active region have characteristic linear dimension L. Let S be the magnetohydrodynamic Poynting flux through the photosphere. We assume that the Poynting flux deposits magnetic energy into the active region without losses and at a constant rate, so that one has $X(t)=X({t}_{i}^{+})+{St}/L$ between flares (${t}_{i}^{+}\leqslant t\leqslant {t}_{i+1}^{-}$). Recent vector magnetogram measurements allow direct observation of the Poynting flux and the magnetic energy density of an active region, which are at odds with our assumption that S is steady in time (Fisher et al. 2012; Sun et al.2012, 2017; Sahu et al. 2023). Putting aside the latter consideration for the moment, we can write the control parameter α as

Equation (A1)

where Xcr is the magnetic energy density threshold for a magnetohydrodynamic instability, for example, and λ0 is the instability's trigger reference rate.

Several plausible magnetohydrodynamic instabilities have been suggested as solar flare triggers in the literature (Ji & Daughton 2011). Some of them do not involve a magnetic energy density threshold at all. For example, the kink instability discussed at the end of Section 6 is triggered when the field-line twist Φ exceeds a threshold Φcr(a), whose value depends on the aspect ratio a of the flaring loop (Török et al. 2004). However, instabilities triggered by a magnetic energy density threshold do exist. One example is plasmoid-induced magnetic reconnection via tearing modes in a fractal current sheet (Shibata & Tanuma 2001), whose threshold depends on fractional powers of the Alfvén speed (or equivalently, on the Lundquist number) and hence on the magnetic energy density; see, e.g., Ji & Daughton (2011) or Section 5 in Shibata & Tanuma (2001). In the latter reference, the threshold condition also depends on the aspect ratio a (width divided by length) of the current sheet, which can vary with time, as a flaring loop responds to subphotospheric turbulence.

What are the timescales on which S and Xcr vary? In the SDP picture, both variables are steady, but the GOES analysis in Section 5 implies that one or both may vary in reality (although other scenarios are possible too, as discussed in Section 6). As far as S is concerned, one expects to find statistical fluctuations on the eddy turnover timescale of subphotospheric turbulence and flux emergence, τflux, as observed with vector magnetograms and Doppler measurements (Fisher et al. 2012). Magnetohydrodynamic simulations and G-band radiative signatures suggest τflux ∼ minutes for magnetic features associated with solar granulation, with peak-to-peak fluctuation amplitude ≲30% (Shelyag et al. 2011a, 2011b; Keys et al. 2011). As far as Xcr is concerned, in the plasmoid-induced reconnection picture as one illustrative example, the magnetic energy density threshold for a Sweet–Parker current sheet to undergo secondary tearing depends not only on the magnetic diffusivity, which fluctuates on the same timescale as the local temperature, but also on a and L, which fluctuate on the turnover timescale τflux, e.g., Equation (15) in (Shibata & Tanuma 2001). Flare waiting times are typically comparable to or longer than τflux, so it is conceivable that the constant-Xcr approximation in the SDP theory does not apply to every active region.

We emphasize again that the mapping in this appendix is idealized and illustrative. None of the statistical analysis in Sections 4.3 and 5 is predicated on the microphysics in this appendix.

Appendix B: Hierarchical Bayesian Framework

This appendix lays out a complementary approach to the heuristic tests with which we analyze individual active regions in Section 5. We first write down the likelihood for a set of observed waiting times and sizes in an individual region, given a set of model parameters, in Appendix B.1. We introduce the hierarchical Bayesian framework, which combines inference in different regions to estimate population-level parameters, in Appendix B.2. In Appendix B.3, we test the efficacy of this approach with synthetic data and explain why it is not appropriate in this paper to apply this framework to the real GOES catalog, despite its efficacy. Finally, in Appendix B.4, we propose an alternative approach using only the average waiting time in each region, as a motivation for future studies. The recipes in this appendix are included for completeness and as a starting point for readers who wish to further develop hierarchical methods of solar flare analysis.

B.1. Likelihood and Bayes' Theorem

Solving Equations (1)–(3) for the long-term observable PDFs, pt) and pX), is analytically intractable for most choices of $\eta \left[{\rm{\Delta }}{X}_{i}\,| \,X({t}_{i}^{-})\right]$. However, for the special case

Equation (B1)

we obtain the analytic result

Equation (B2)

where the variable z is either Δt or ΔX, i.e., the waiting-time and size PDFs are identical (Fulgenzi et al. 2017). In Equation (B1), we have δ > 0 and the constant of proportionality is set by the condition that η must integrate to unity between ΔXi = 0 and ${\rm{\Delta }}{X}_{i}=X({t}_{i}^{-});$ see Section 6 and Appendix D of Fulgenzi et al. (2017) for a full derivation. Equation (B1) is a monotonically decreasing function of ΔXi , i.e., small stress-release events are preferred over large events. Its specific functional form is reasonable but arbitrary; it is not inferred from solar flare data. We adopt it here as a pedagogical device to illustrate the advantages and disadvantages of a hierarchical Bayesian approach to analyzing flare data.

When we restore the dimensions to Equation (B2), and consider the set of observations in the kth active region, Dk = {Δti,k , Δsi,k }, with 1 ≤ iNk , we can write the likelihood

Equation (B3)

Equation (B4)

where ξk is the constant of proportionality in Equation (5), which translates the drop in magnetic energy density ΔX to the observed flare size, Δs, and one has βk = λ0,k τk + δk . While there are Nk flare sizes in the region, there are only Nk − 1 waiting times. The three model parameters θ k = {τk , ξk , βk } are assumed constant in time for the lifetime of the active region. Bayes' theorem calculates the posterior probability distribution (henceforth "posterior"), p( θ k Dk ), i.e., the probability density of parameter vector θ k given the data Dk , using the prior probability distribution (henceforth "prior"), π( θ k ), viz.

Equation (B5)

The normalizing constant of proportionality is often called the evidence, and while essential for model comparison, it is not relevant for the parameter estimation exercise below (Gelman et al. 2013).

B.2. Population-level Parameter Estimation

Suppose, for the sake of illustration, that Xcr, λ0, and every component of θ k are approximately the same in all active regions. In the language of hierarchical Bayesian inference, this corresponds to assuming that the value of θ k for each region is a random number drawn from a population-level distribution known as a "hyperprior," with narrow extent. For concreteness, we assume the hyperprior for each model parameter is a Gaussian with mean and standard deviation μa and σa respectively, with a ∈ {τ, ξ, β}. While βk does depend on τk , the inference remains accurate so long as the covariance between τk , λ0,k , and δk is minimal. The marginal posterior distribution for the parameters describing the hyperpriors is calculated as

Equation (B6)

with Λ = {μa , σa }. In Equation (B6), π(Λ) is the prior on Λ, and ${ \mathcal D }=\{{D}_{1},\,\ldots ,\,{D}_{k},\,\ldots \,{D}_{M}\}$ is the set of all data, Dk , from M different active regions.

In practice, we can use a Monte Carlo sampler to estimate Equation (B6), and thus infer the posterior predictive distributions, p( θ ), where we drop the subscript k to signify that these distributions bring together information from all active regions to inform the inference. We opt to sample using a Hamiltonian Monte Carlo No U-Turn Sampler (Betancourt 2018), as implemented in the Stan programming language (Stan Development Team 2022).

B.3. Validation with Synthetic Data

To estimate the efficacy of the scheme in Appendix B.2 to infer population-level parameters, we first apply it to a set of synthetic data generated directly from the model. That is, we generate M = 50 fake "active regions," each with Nk = 100 events, with flare sizes and waiting times generated from the SDP framework with $\eta \left[{\rm{\Delta }}{X}_{i}\,| \,X({t}_{i}^{-})\right]$ as in Equation (B1). In what follows, we use the compact notation $x\sim { \mathcal N }(\mu ,\sigma )$ to denote that the random variable x is drawn from a Gaussian distribution with mean μ and standard deviation σ. To generate the data for each region, we select a value ${\tau }_{k}\sim { \mathcal N }({\mu }_{\tau }=3\,{\rm{day}},{\sigma }_{\tau }=0.5\,{\rm{day}})$, a value ${\xi }_{k}\sim { \mathcal N }({\mu }_{\xi }\,=2\,{\rm{arb.}}\,{\rm{units}},{\sigma }_{\xi }=0.3\,{\rm{arb.}}\,{\rm{units}})$, and a value ${\beta }_{k}\sim { \mathcal N }({\mu }_{\beta }\,=5,{\sigma }_{\beta }=0.3)$. Each Gaussian is truncated such that all variates are positive.

After running the sampler with the synthetic data, we perform a posterior predictive check, i.e., we compare the samples from our posterior distributions p(τ), p(ξ), and p(β) with the injected distributions and the priors on the hyperparameters. We show the results in Figure 9. We see that the posterior samples (blue) appropriately overlap with the injected ground truth (red) for τ, ξ, and β.

Figure 9.

Figure 9. Posterior predictive check demonstrating the recovery of injected population-level parameters. The weakly informative priors set on the hyperparameters τ, ξ, and β are shown in green, histograms of samples from the injected population parameters are shown in red, while histograms of samples from the posterior distributions are shown in blue. See text in Appendix B.3 for details on the procedure.

Standard image High-resolution image

Despite the encouraging results in Figure 9, we do not apply this hierarchical Bayesian scheme to the GOES catalog in this paper. This is because, in most regions, neither the waiting-time nor the size PDFs follow the functional form in Equation (B2). When we include Equation (B2) in the set of options available for the AICc to select between, as in Section 5.1, fewer than 0.1% of active regions are fitted best with Equation (B2). Therefore, Equation (B4) is not an appropriate likelihood for the data. Other choices of the jump distribution in Equation (B1) may alleviate this problem, but they lie outside the scope of this paper.

B.4. Likelihood Based on the Average Waiting Time

An alternative approach is to build the likelihood out of a summary statistic, such as the average waiting time in a given region, 〈Δtk , rather than each observed waiting time and/or size. In the SDP framework, we find (Fulgenzi et al. 2017; Millhouse et al. 2022)

Equation (B7)

where α0 ∼ 1 is a dimensionless constant whose exact value depends on the particular form of $\eta \left[{\rm{\Delta }}{X}_{i}\,| \,X({t}_{i}^{-})\right]$. Restoring the dimensions to Equation (B7), we can write the likelihood as

Equation (B8)

with θ k = {τk , λ0,k , α0,k , σk }, where we assume the residual of 〈Δtk with the model subtracted is normally distributed, with zero mean and standard deviation σk . With the likelihood in Equation (B8), one could perform population-level parameter estimation as described in Section B.2. We leave this, as well as an exploration of how θ k depends on parameters intrinsic to each active region, to future work.

Appendix C: Cleaning GOES Flare Summary Data

The SWPC and NGDC File Transfer Protocol servers that host the GOES flare summary data, as well as other data portals such as the Heliophysics Event Knowledgebase, 8 contain numerous typographical anomalies found when collating the data into a homogeneous catalog. These anomalies are found in the recorded active region number by checking whether one has Δti,k < 14 days for 1 ≤ iNk − 1 for each active region. The limit of 14 days applies because active regions are typically only visible for two weeks due to the rotation period of the Sun. The one exception in the GOES flare summary data is a set of eight flares that occurred on 2002 November 2 and were assigned to active region number 10198. Two weeks later, the active region appeared on the eastern limb of the Sun, and 33 additional flares were assigned to active region number 10198, from 2002 November 17 until 2002 November 28. We opt not to consider the former eight flares as part of active region number 10198, and we do not include them in the catalog.

The anomalies and hand-corrected values are tabulated in Table 1. The corrected values are determined manually by referring to the context of surrounding flares in the database. For example, the flare starting at 1981 July 18 11:46 is recorded with active region number 3121. Yet the other 11 flares associated with the latter region occurred between 1981 May 24 and 1981 June 1, while active region number 3221 has 32 flares recorded between 1981 July 17 and 1981 July 29, indicating that the anomalous flare should be associated with the latter active region. The latitude of the anomalous flare is also within ±5° of other flares from active region number 3221, while flares in active region number 3121 are ≳10° higher in latitude.

Table 1. Anomalous Active Region Numbers Found in the GOES Flare Summary Data

Flare Start TimeAnomalous Active Region NumberCorrected Active Region Number
1978-05-30 06:1910001134
1981-07-18 11:4631213221
1981-08-22 06:583663266
1983-03-01 18:2421024102
1983-03-01 18:5421024102
1983-07-04 06:0941354235
1983-07-29 03:5342364263
1993-09-27 01:3575007590
2000-11-09 21:139125
2002-06-14 20:18110001
2003-07-31 07:5942210422
2003-12-21 04:1010000
2011-12-22 13:041128111381
2017-07-11 01:091265512665
2017-07-16 10:251265512665
2021-05-10 23:461228212822
2021-09-01 03:031268012860
2021-09-01 04:271268012860
2021-11-06 22:011298412894
2021-12-18 11:171280712907
2021-12-18 17:271280712907

Note. Anomalies found by checking whether one has Δti,k < 14 days for 1 ≤ iNk − 1 for each active region. Corrected values are determined manually, by considering the context of what active regions are present at the time of the anomalous flare at a similar latitude. Corrected values of "..." indicate that we cannot identify a reasonable active region to associate with the anomalous flare.

Download table as:  ASCIITypeset image

Appendix D: Log-normal Waiting-time Distributions

A sequence of instantaneous events ordered in time is known as a point process. If the instantaneous event rate does not change with time and the process is memoryless, it is called a Poisson point process and exhibits exponentially distributed waiting times (Kingman 1993). If the instantaneous rate does change with time, it is a nonhomogeneous Poisson process, which can exhibit waiting times that are distributed as a log-normal for certain rate functions (Gardiner 2009; Last & Penrose 2017). The SDP is one example of a nonhomogeneous Poisson process that can generate waiting times distributed as a log-normal, for certain choices of $\eta \left[{\rm{\Delta }}{X}_{i}\,| \,X({t}_{i}^{-})\right]$ and α. One can alternatively generate log-normally distributed values via a multiplicative process, e.g., an organism that grows in proportion to its current size multiplied by a random variable generates (at least approximately) log-normally distributed sizes, due to the Central Limit Theorem (Mitzenmacher 2003).

Log-normal waiting-time distributions are observed in many other contexts besides solar flares, including but not limited to gamma-ray bursts (Li & Fenimore 1996), X-ray bursts (Göğüş et al. 1999, 2000; Gavriil et al. 2004), fast radio bursts (Gourdji et al. 2019), network traffic (Paxson & Floyd 1994; Singhai et al. 2007), mining equipment failure (La Roche-Carrier et al. 2019), earthquake aftershocks (Peng & Zhao 2009), and test response times (van der Linden 2006).

Appendix E: Example Fits for Individual Active Regions

The reader may be curious to examine some examples of the variety of distributions fitted to individual active regions. To this end, in Figure 10 we show the complementary cumulative distribution function (CCDF) for the flare waiting times and sizes from four arbitrary but representative active regions with Nk ≥ 20. We display the distributions as CCDFs instead of PDFs, to avoid binning the data. The black stepped curves indicate the empirical CCDF constructed from the GOES catalog, while the colored curves are the various fits to the data, with parameters fixed to their maximum likelihood values. The distribution with bold typeface in the legend is the shape that best fits the data according to the AICc.

Figure 10.

Figure 10. Complementary cumulative distribution functions (CCDFs) for four arbitrary but representative active regions with Nk ≥ 20. The distribution that best describes the data in each panel (black stepped curve), according to the AICc, is in bold typeface in the legend for each panel. The gray regions in the bottom panels indicate Δs ≤ 10−3 Jm−2, i.e., where the masking described in Section 4.2 is applied.

Standard image High-resolution image

Footnotes

Please wait… references are loading.
10.3847/1538-4357/acc387