A statistical study of the subsurface structure and eruptivity of solar active regions

A statistical study of 77 solar active regions (ARs) is conducted to investigate the existence of identifiable correlations between the subsurface structural disturbances and the activity level of the active regions. The disturbances examined in this study are $<|\delta \Gamma_1/\Gamma_1|>$, $<|\delta c^2/c^2|>$, and $<|\delta c^2/c^2-\delta \Gamma_1/\Gamma_1|>$. where $\Gamma_1$ and $c$ are the thermodynamic properties of first adiabatic index and sound speed modified by magnetic field, respectively. The averages are over three depth layers: $0.975-0.98 R_\odot$, $0.98-0.99 R_\odot$ and $0.99-0.995 R_\odot$ to represent the structural disturbances in that layer. The level of the surface magnetic activity is measured by the Magnetic Activity Index (MAI) of active region and the relative and absolute MAI differences (rdMAI and dMAI) between the active and quiet regions. The eruptivity of each active region is quantified by its Flare Index, total number of coronal mass ejections (CMEs), and total kinetic energy of the CMEs. The existence and level of the correlations are evaluated by scatter plots and correlation coefficients. No definitive correlation can be claimed from the results. While a weak positive trend is visible between dMAI and $<|\delta \Gamma_1/\Gamma_1|>$ and $<|\delta c^2/c^2|>$ in the layer $0.975-0.98 R_\odot$, their correlation levels, being approximately 0.6, are not sufficiently high to justify the correlation. Some subsurface disturbances are seen to increase with eruptivity indices among ARs with high eruptivity. The statistical significance of such trend, however, cannot be ascertained due to the small number of very eruptive ARs in our sample.


Introduction
Thanks to the increasing impact of space weather on modern society, many studies have been conducted to examine the relationship between the observed photosphere magnetic field and the production of solar flares (e.g., McAteer et al. 2005;Georgoulis and Rust 2007;Leka and Barnes 2007;Schrijver 2007;LaBonte et al. 2007;Song et al. 2009;Park et al. 2010;Ahmed et al. 2013, to name a few). The contributions of the subsurface flow dynamics to the flare production have also been investigated (e.g., Komm et al. 2004;Reinard et al. 2010;Komm et al. 2011). Despite different results reported from different studies, it is generally agreed that the complexity of magnetic field (or the deviation from a potential field) and the twisting of the foot points of field lines increase magnetic energy and stresses in the field, resulting in a more favorable environment for solar eruptions, and that the solar eruptions remove the excess magnetic energy and stresses from the field, returning it to a lower energy, more stable state.
In contrast, the relationship between the productivity of solar eruptions and thermal properties of the subsurface structure is largely unknown. It is uncertain whether the subsurface thermal structures can be related to solar eruptions through some physical mechanisms. Because of high gas-to-magnetic pressure ratio below the solar surface, the relationship, even if exists, can be expected to be very weak, and the average thermal structure should not show appreciable changes over the timescale of one eruption. Therefore, the relationship is likely to be detectable only among the active regions with sufficiently high eruptivity level and/or significant subsurface disturbances. The objective of this work is to investigate whether such relationship can be detected with current level of observational and technological capability. The strategy is to conduct a statistical study on the relationships between the disturbances of subsurface structural properties and the levels of both the coronal eruptivity and surface magnetic activity of the solar active regions (ARs). The results can shed lights on the physics involved in the interactions between gas and the magnetic field and the connection from below to above the solar surface.
The relationship between the subsurface thermal anomalies and surface magnetic activity has been examined by Bogart et al. (2008) and Baldner et al. (2013), and a positive correlation was claimed by both studies. Bogart et al. (2008) also examined the relationship between the subsurface thermal anomalies and total flare activity, but found no correlation between the two. However, solar flares are not the only eruptive phenomenon in the corona. The reported un-correlation with flare activity therefore does not completely rule out the possibility of a correlation with the eruptivity of active region and/or the productivity of other types of strong eruptions. Here we considered the contributions from both flares and coronal mass ejections (CMEs), two largest eruptive phenomena in the corona, in the assessment of the eruptivity of a region, and then conducted a statistical analysis of 77 active regions to examine the relationship between each available subsurface thermal properties and different indices that characterize the surface magnetic activity and coronal eruptivity. The subsurface structural differences of these regions are a subset of the inversion results of Baldner et al. (2013).
The details of the data source, the definition of different indices and the analysis procedures are described in Sec. 2, the results are discussed in Sec. 3, and a summary of the results is given in Sec. 4.

Disturbances of the subsurface structure
The disturbed thermal properties examined in this study are < |δc 2 /c 2 | >, < |δΓ 1 /Γ 1 | > and < |δc 2 /c 2 − δΓ 1 /Γ 1 | >, in which c is the thermodynamic property sound speed modified by the existence of magnetic field, and Γ 1 is the adiabatic index defined as (∂ ln P/∂ ln ρ) s , where P , ρ and s are gas pressure, density and entropy. We emphasize that c is not the travel speed of wave but a thermodynamic property defined as Γ 1 P/ρ. In a region free of magnetic field and away from ionization zones, P is simply the pressure of gas, and δc 2 /c 2 is a direct representation of the difference in temperature. However, because δc 2 /c 2 used in this study is the inversion result of solar active regions, which contain strong magnetic fields, P is not simply gas pressure, and δc 2 /c 2 in this case cannot directly reflect the temperature difference. δΓ 1 /Γ 1 mainly results from a difference in the ionization degree of gas or the equations of state. Thus, the quantity δc 2 /c 2 − δΓ 1 /Γ 1 can represent the part of the structural difference that is not due to the change of the ionization degree or the equations of state.
δc 2 /c 2 and δΓ 1 /Γ 1 were provided by Baldner et al. (2013, private communication), who applied ringdiagram analysis (Hill 1988) and local helioseismic inversion to obtain the differences between ARs and their co-latitude reference quiet-Sun regions (QSs). The data used in their analysis were the Dopplergrams from the Michelson Doppler Imager (MDI) instrument on-board the Solar and Heliospheric Observatory (SoHO). The results are the depth profiles of the relative differences averaged over a 15 • patch in space and 4 − 7 days in time. Since the difference between δc 2 /c 2 and δΓ 1 /Γ 1 is one of the quantities examined in present work, only those AR-QS pairs that have both δc 2 /c 2 and δΓ 1 /Γ 1 inversion results were included in this study. There are total 77 pairs. This collection of ARs covers time period from 1996 July to 2010 November. Noticing that δc 2 /c 2 − δΓ 1 /Γ 1 is generally large only in 0.98 − 0.99R ⊙ and small elsewhere, I divided the region where the inversion results are most reliable into three layers: The absolute values of the relative differences were averaged over these three depth ranges to represent the average structural disturbances in these layers.

Indices of surface magnetic activity levels
The level of surface magnetic activity of a region is characterized by the Magnetic Activity Index (MAI; Basu et al. 2004), which is defined as the total strong unsigned magnetic flux within the area of inversion (15 • patch) averaged over the tracking period (4−7days). In other words, the absolute values (in the unit of Gauss) of the strong fields are integrated over the strong-field area within the region of analysis, and averaged over the time interval of the analysis. The MAI of each region used in this work was provided by Baldner et al. (2013, private communication), who computed the values using the MDI magnetograms. Bogart et al. (2008) had reported positive correlation between the inversion results and MAI of AR (MAI AR ), and Baldner et al. (2013) also claimed a positive envelope between δc 2 /c 2 and the absolute difference of MAI between AR and QS. However, the inversion results are the relative structural differences between AR and QS region. If the magnitudes of MAI of QS regions are negligible compared with those of their pairing ARs, MAI AR would be equivalent to MAI AR -MAI QS , and would be appropriate to represent the difference in the magnetic activity between AR and QS. However, the MAI of several QS regions in our data set are more than 10% of the MAI of their pairing ARs. Therefore, in this study, we examined the relationships between the subsurface relative differences and three variations of the magnetic activity indices: MAI AR , dMAI≡MAI AR −MAI QS and rdMAI≡(MAI AR −MAI QS )/MAI QS .

Indices of coronal eruptivity levels
An ideal index to represent the level of eruptivity would be one that appropriately incorporate contributions of all types of eruptions. However, there is no generally accepted method to combine different types of eruptions. To avoid subjective bias in making the combination, the productivity of different types of eruptions was measured separately. The specific eruptions considered in this study were solar flares and coronal mass ejections, which are the two major eruptive phenomena in the corona. The level of eruptivity of each active region was thus gauged by its productivity of these two types of eruptions. Three indices, Flare Index (FI), total number of associated CMEs (Ncme) and total kinetic energy of the CMEs (KEsum) were derived to assess the productivity.
To derive the indices, a data base of the flare and CME events associated with each region during its visible time on the solar surface was constructed. The visible time was determined by checking the dates of the appearance and disappearance of each AR in So-larMonitor 1 . The CME events and their source regions were identified by examining images from different sources including EIT (Extreme-Ultraviolet Imaging Telescope; Delaboudinière et al. 1995) daily movies 2 , SOHO LASCO (Large Angle and Spectrometric Coronagraph; Brueckner et al. 1995) CME catalog 3 , STEREO/SECCHI data (Solar TErrestrial RElations Observatory/Sun Earth Connection Coronal and Heliospheric Investigation; Howard et al. (2008)), and STEREO COR1 CME catalog 4 . The SOHO LASCO CME catalog was also used for the information of estimated kinetic energy of most of the CMEs. The flare information was based on the NOAA/USAF Active Region Summary 5 .
Flare index (FI) is a quantity to quantify the daily flare activity over 24 hours per day (Antalová 1996), and is defined as where T is the time interval (measured in days), and S (i) is the sum of the significances of the peak flux (W/m 2 ) of flare class i as measured by GOES (Geostationary Operational Environmental Satellite; Garcia (1994)) over the interval T . Ncme is the total number of CMEs associated with an AR during its entire visible time. Each event, irrespective of its strength, is equally counted. There are several fainter ejections that were seen in EIT and STEREO COR1 but were not listed in the catalog. The number of these unlisted events is considered as the uncertainty of Ncme.
KEsum is the sum of the kinetic energy (KE) of the CMEs divided by 1E+30 erg, which is the average magnitude of the KE of an CME. The scaling was to reduce the magnitude of KEsum to avoid numerical problems. The LASCO CME catalog provides an estimated representative kinetic energy for many events. However, there are also many events that are listed but do not have an estimated KE in the catalog. These are often relatively faint events or events propagating in a direction that does not allow the measurements of the linear speed or mass from the view point of LASCO. The KE of such event was given a value of 1.E+29 erg. Event that was not detected by LASCO C2 but was seen in EIT and STEREO COR1 was given a value of 1.E+28 erg as KE because such event is usually fainter than the average ones. The sum of all the unlisted KEs is considered as the uncertainty of KEsum.

Analysis method
To simplify the notations for the rest of the paper, Y(i, j) is used to represent the three averaged subsurface structural differences: Y(0, j) to Y(2, j) denote < |δΓ 1 /Γ 1 | >, < |δc 2 /c 2 | > and < |δc 2 /c 2 − δΓ 1 /Γ 1 | >, respectively, and Y(i, 0) to Y(i, 2) represent the three depth ranges 0.975 − 0.98R ⊙ , 0.98 − 0.99R ⊙ and 0.99 − 0.995R ⊙ , respectively. The six activity/eruptivity indices are denoted by X(k): X(0) to X(5) stand for rd-MAI, dMAI, MAI AR , FI, Ncme and KEsum, respectively. The plan was to go through all possible combinations of Y(i,j) versus X(k) to see if any correlation can be identified between the subsurface disturbances and the activity/eruptivity above the surface. For each combination, a scatter plot was generated for visual inspection, and the level of correlation was assessed by the correlation coefficient (CC) (Barlow 1989): where x and y are two linearly related variables, σ x and σ y are their respective standard deviations, and x, y and xy are the means of x, y and xy, respectively. However, the correlation coefficient should not be used as a confirmation or rejection of the existence of a correlation (Press et al. 1992) because there is no universal way to evaluate the significance of the value of the correlation coefficient in different situations. Hence, in this study, it is only used as a reference to quantify the level of correlation after a linear trend is visually identified in the scatter plots, and CC=0.6 was chosen as the threshold for a correlation to be significant.

Results and discussion
The results of the subsurface relative differences versus rdMAI, dMAI, MAI AR , FI, Ncme and KEsum are plotted in Figures 1 to 8. In each figure, < |δΓ 1 /Γ 1 | >, < |δc 2 /c 2 | > and < |δc 2 /c 2 −δΓ 1 /Γ 1 | > are respectively placed in top, middle and bottom rows, and different depths are separated in different columns, as indicated on the top of each column. Figure 1 to 3 show the results of Y(i,j) vs. different indices of the surface magnetic activity. In all three figures, while < |δc 2 /c 2 | > shows a tighter distribution than < |δΓ 1 /Γ 1 | > in the layer 0.98 − 0.99R ⊙ , the two becomes almost identical in the deeper layer 0.975−0.98R ⊙ . Figure 1 shows no distinguishable regular pattern, indicating that the subsurface disturbances are uncorrelated with the relative difference of MAI. Interestingly, the profiles in Figure 2 and 3 are almost identical, suggesting that the relationships with dMAI and with MAI AR are very similar. Therefore, the values of CC are only shown in Fig. 2. and the results of Fig. 2 and 3 are discussed together in the following. Most of the plots do not show discernible regular patterns. A weak positive trend is visible in some plots of < |δΓ 1 /Γ 1 | > and < |δc 2 /c 2 | > (top two rows), especially in the layer 0.975 − 0.98R ⊙ . The correlation coefficients of these plots, being only approximately 0.6, are not sufficiently high to indicate a definitive correlation. The earlier studies by Baldner et al. (2013); Bogart et al. (2008), however, have claimed the existence of a positive correlation from their analysis. It should be noted that the subsurface anomalies analyzed in the two earlier studies are the averages of signed relative differences. The divisions of depth are also different between current and the two earlier studies.
The results of Y(i,j) vs. FI are plotted in Figure 4. The figure reveals that all but two points are located below FI = 50 and that most points are concentrated in a small region of FI < 10. The two points with outstandingly high FI are AR10488 and AR10656. To better inspect the patterns in the populated region, the region of FI < 50 was re-drawn in Figure 5. The points below FI = 10 in all plots of Figure 5 are widely scattered with no identifiable pattern. In the region of FI = 10 − 50, an approximately positive trend can be seen in a few panels. The correlation coefficients for points in this range of FI are shown in the corresponding panels. There are three panels with CC higher than 0.6: bottom left, < |δc 2 /c 2 − δΓ 1 /Γ 1 | > 0.975−0.98R⊙ , middle middle, < |δc 2 /c 2 | > 0.98−0.99R⊙ , and middle right, < |δc 2 /c 2 | > 0.99−0.995R⊙ . Despite the high correlation coefficients, there are only ten ARs in this FI range, and the positive trend seems to appear in three arbitrary panels. Therefore, no conclusion can be drawn regarding the relationship between the flare productivity and the subsurface thermal disturbances. The ten ARs with FI between 10 and 50 are listed in Table 1.
The results of Y(i,j) vs. Ncme are presented in Figure 6. Except for one outlying point at Ncme = 19, all other points are located below Ncme ≈ 13. The single exceptional point is AR09390. Despite its high productivity of CMEs, the FI of AR09390 is only approximately 10, and its subsurface thermal disturbances are no larger than those of the rest of the data points. Therefore, in the figure of Y(i,j) vs. FI (Fig. 5), this point corresponds to a point at the lower left corner of the range in which a positive trend is seen. The distributions in Figure 6 in general show no clear regular pattern. However, in some panels in the middle and right columns of the figure, a weak positive trend can be seen in the range Ncme > 5. The correlation coefficients for the points in this range (Ncme = 5 − 15) are printed in corresponding panels. The values are mostly unremarkable, suggesting that the productivity of CMEs is not strongly related to most of the subsurface thermal properties. The plot with the highest CC is < |δc 2 /c 2 − δΓ 1 /Γ 1 | > in 0.99 − 0.995R ⊙ (CC≈ 0.77). While this suggests that certain mutual effects between CME production and < |δc 2 /c 2 − δΓ 1 /Γ 1 | > may be detectable just beneath the surface, the number of ARs with Ncme larger than 5 in our sample, being only twelve, is insufficient to justify this implication. The twelve ARs that form this positive trend are listed in Table 1.
The results of Y(i,j) vs. KEsum are shown in Fig. 7. Most of the points are distributed below KEsum ≈ 20 except for four points, AR09433 (KEsum = 57.2), AR09390 (KEsum = 68.5), AR10792 (KEsum = 83.8) and AR09787 (KEsum = 340). To better inspect the majority of points, Y(i,j) vs. KEsum was re-plotted for the range of KEsum ≤ 20 in Figure 8. The figure reveals a gap around KEsum ≈ 8, above which a positive trend is visible in all panels. Below this gap, while the distribution patterns are more complex than a linear trend, they are not as randomly and widely scattered as the patterns in Fig. 5 and 6. The correlation coefficients for the points located between KEsum = 8 and 20 are shown in corresponding panels. Most of the values are equal to or higher than 0.6, indicating a good level of correlation. However, this indication cannot be confidently verified in the current study because this positive trend consists of only ten ARs. These ten ARs are listed in Table 1. It is worth pointing out that although the positive trends seen in the scatter plots of the three eruptivity indices are all composed of approximately ten ARs, there is less than 50% overlap in the identity of the ARs, as revealed in Table 1.

Summary
A statistical study of 77 ARs was conducted to investigate the possibility of correlations between the subsurface structural disturbances of ARs and their surface magnetic activity and coronal eruptivity. The specific subsurface disturbances examined were < |δΓ 1 /Γ 1 | >, < |δc 2 /c 2 | > and < |δc 2 /c 2 − δΓ 1 /Γ 1 | >, where Γ 1 and c are two thermodynamic properties defined as Γ 1 ≡ (∂ ln P/∂ ln ρ) s and c ≡ Γ 1 P/ρ, in which P , ρ and s are pressure, density and entropy. The absolute values were averaged over three ranges of depth: 0.975 − 0.98R ⊙ , 0.98 − 0.99R ⊙ and 0.99 − 0.995R ⊙ , to represent the average structural disturbances in these layers. The surface magnetic activity was measured by the Magnetic Activity Index of AR (MAI AR ) and the relative and absolute differences of MAI between AR and QS (rdMAI, dMAI). The coronal eruptivity level was gauged by Flare Index (FI), total number of CMEs (Ncme) and total kinetic energy of these CMEs (KEsum) of each AR. The subsurface disturbances are denoted by Y(i,j) and activity and eruptivity level indices by X(k) for simplicity. The analysis consisted of visually inspecting the scatter plots of different pairs of variables and calculating the correlation coefficients of the distribution patterns.
The subsurface anomalies and MAI AR and dMAI were reported to be positively correlated by earlier studies (Bogart et al. 2008;Baldner et al. 2013). However, the positive trend is only visible in the deepest layer 0.975 − 0.98R ⊙ in our analysis. With a correlation coefficient value approximately 0.6, this correlation cannot be claimed by our study. It should be noted that the quantities examined here are the averages of unsigned relative differences while the two earlier studies had used the averages of signed values. The divisions of the depth also differ between current and the earlier studies. When Y(i,j) were plotted against the relative MAI differences, no correlation can be identified.
The scatter plots of Y(i,j) vs. three eruptivity indices indicate that the distribution profiles change with the magnitudes of the eruptivity indices. In the region of low eruptivity, the points are generally widely scattered with no distinguishable regular patterns. The points only become more tightly and orderly distributed in the region with higher eruptivity, and a positive trend with CC≥ 0.6 can be seen in some plots (cf. Table 2). Among the three eruptivity indices, the positive trend is most prominent in KEsum. The level of correlation is ≥ 0.6 in most plots of Y(i,j) vs. KEsum. In Y(i,j) vs. Ncme, while the correlation level of the trend is in general low, a pattern of the correlation coefficient becoming higher in the shallower layer can be seen. The occurrence of the positive trend in Y(i,j) vs. FI, in contrast, does not show identifiable regularity. These distributions with relatively high correlation levels are all composed of only 10 − 12 points. It is, therefore, uncertain whether the tight distribution indicates that the correlations are detectable only among sufficiently eruptive ARs or after many strong eruptions, as speculated in Sec. 1, or whether it is simply a result of fewer points. In addition, many studies have pointed out that the helioseismic inversions based on the identification of oscillation phases can be contaminated by surface effects (e.g. Couvidat and Rajaguru 2007;Cally 2009). Although the subsurface structural differences here were obtained by inverting the frequency differences determined by the ring-diagram analysis (Hill 1988), the surface effects may distort the profile of the ring spectra resulting in errors in the determination of frequencies (Cally 2013, private discussion). Therefore, the "positive trend" may also partly be a result of the inversion results containing effects propagating down from the corona, rather than an indication of a true correlation between the subsurface thermal structural disturbances and the coronal eruptivity. Therefore, no definitive correlations can be claimed at this stage. To verify the correlations suggested by the analysis in this study, we will need to first improve the current Helioseismic inversion procedures, and then apply the structural inversions to more ARs with high eruptivity. A theoretical study of the relationship between the subsurface disturbances and the eruptivity in the atmosphere is also necessary.
Lastly, it is interesting to note that the distribution pattern seems to depend little on the magnitude of the subsurface disturbances. In other words, considering only the ARs with larger, or smaller, subsurface disturbances does not lead to more ordered distribution. Fig. 1 The scatter plots of the subsurface structural disturbances in different depths vs. rdMAI. Different structural differences are placed in different rows: < |δΓ1/Γ1| > (top), < |δc 2 /c 2 | > (middle), < |δc 2 /c 2 − δΓ1/Γ1| > (bottom), as labeled on the left of each row. Results of different layers from deep to shallow are separated into columns from left to right, as indicated on the top of each column. Each point corresponds to one active region, and is located according to Y(i,j) and X(k) of that active region.  Fig. 2 The scatter plots of the subsurface structural disturbances in different depths vs. dMAI. The arrangement of the panels and symbols is the same as in Fig. 1.  Fig. 3 The scatter plots of the subsurface structural disturbances in different depths vs. MAI of AR. The arrangement of the panels is the same as in Fig. 1.  Fig. 4 The scatter plots of the subsurface structural disturbances in different depths vs. FI. The arrangement of the panels is the same as in Fig. 1. The dotted lines mark the locations of FI= 10 and 50, within which a weak positive trend is visible in some plots (bottom left and middle middle).

Fig. 5
The scatter plots of the subsurface structural disturbances in different depths vs. FI for FI< 50. The arrangement of the panels is the same as in Fig. 1. The dotted line marks the location of FI= 10. The number shown in each panel is the correlation coefficient of the points in the range 10 <FI< 50. Fig. 6 The scatter plots of the subsurface structural disturbances in different depths vs. Ncme. The arrangement of the panels is the same as in Fig. 1. The dotted lines mark the locations of Ncme= 5 and 15, within which the correlation coefficients are shown. Fig. 7 The scatter plots of the subsurface structural disturbances in different depths vs. KEsum. The arrangement of the panels is the same as in Fig. 1. The dotted line marks the location KEsum = 20, below which the majority of the points are located. Fig. 8 The scatter plots of the subsurface structural disturbances in different depths vs. KEsum for KEsum < 20. The arrangement of the panels is the same as in Fig. 1. The dotted line marks the location KEsum = 8. A rising profile can be seen on the right of the line in most panels. The number shown in each panel is the correlation coefficient calculated for the points located within 8 < KEsum < 20.