Ensemble Variability Properties of Active Galactic Nuclei in the SDSS DR17

We present the results from a study of ~9,600 Broad-Line selected AGN with host galaxies detected from the Sloan Digital Sky Survey Data Release 17 (SDSS DR17). We compute ensemble variability statistics based on the comparison of the original SDSS photometric data with spectrophotometric measurements obtained days to decades later in the Sloan g-, r-, and i-bands. Galaxy and AGN templates have been fitted to the SDSS spectra to isolate the AGN component from the host galaxy. The sources have absolute magnitudes in the range -24<M_i<-18 and lie at redshifts less than z~0.9. A variability analysis reveals that the anti-correlation between luminosity and variability amplitude continues down to log(L_bol[erg/s]) = 43.5, demonstrating that the relationship extends by 4 orders of magnitude in AGN luminosity. To further explore the connection between AGN luminosity and variability, we determine the black hole mass and the accretion rate through measurement of the H-Beta line width and the monochromatic luminosity at rest-frame 5,100 angstrom. Our results suggest that the accretion rate is the dominant parameter impacting the amplitude of variability and that the anti-correlation between accretion rate and amplitude extends to rates as low as 1% Eddington. Moreover, we also identify an anti-correlation between variability amplitude and black hole mass, with the correlation appearing strongest among the AGN with low accretion rates.


INTRODUCTION
Variability is a common feature of AGN and has been detected over a range of wavelengths and timescales (Matthews & Sandage 1963;Sesar et al. 2006).Since optical continuum radiation originates from the accretion disk, variability in the UV/optical regime is also thought to come from processes intrinsic to the accretion disk, such as inhomogeneities in the temperature at various radii (e.g., Ruan et al. 2014) and variations in the accretion rate (Pereyra et al. 2006;Zuo et al. 2012;Sartori et al. 2019).Reverberation mapping (e.g., Peterson et al. 2004) strongly implies that the accretion disk is the origin of the variability based on the lagging response of emission lines after continuum fluctuations.
To probe the origin of the variability, correlations between the variability properties and the characteristics of the AGN have been explored.For example, the amplitude of variability has been found to correlate with several AGN parameters such as luminosity, redshift, wavelength, and possibly black hole mass (Hook et al. 1994;Vanden Berk et al. 2004;Kelly et al. 2009;Li et al. 2018;Rumbaugh et al. 2018;Suberlak et al. 2021;Yi et al. 2022).A particularly strong anti-correlation has been identified between variability amplitude and AGN luminosity, indicating that more luminous AGN vary less than fainter ones (e.g., Wilhite et al. 2008;Zuo et al. 2012;Simm et al. 2016).
Since AGN luminosity, accretion rate, and black hole mass are interdependent, considerable care has been taken to determine the driving factor for this anti-correlation.In a study of more than 7,500 quasars from the SDSS, Zuo et al. (2012) found that the anti-correlation with accretion rate was more significant than that with luminosity when keeping other characteristics constant.More recently, Li et al. (2018, hereafter L18) and Yu et al. (2022) found that the amplitude of variability, particularly the long-term amplitude, is primarily determined by the accretion rate.Likewise, Rumbaugh et al. (2018) found that extreme variability AGN (those with fluctuations greater than a magnitude) generally have lower accretion rates than normal quasars with similar luminosities and suggest that the variability mechanism at very low accretion rates may be different than that for typical quasars.
To explore the relationship between various AGN parameters at fainter luminosities than previously studied, Gallastegui-Aizpun & Sarajedini (2014, hereafter GS14) selected a sample of ∼5,000 galaxies from the SDSS DR7 having broad emission lines and extended morphologies.By requiring that the sources have galaxy-like morphologies, the sample was not biased against intrinsically faint AGN with significant contributions from host galaxies.Using the approach of Vanden Berk et al. (2006, hereafter VB06), they separated the AGN and host galaxy light using linear combinations of eigenspectra templates for quasars and galaxies.The variability was quantified as the difference between magnitudes measured from the source photometry and follow-up spectroscopy in the Sloan g-, r-, and i-bands.They constructed an ensemble Structure Function (SF) for these sources and confirmed the anti-correlation found among quasars for AGN extending ∼2 magnitudes fainter than previously studied.While the amplitude of the variability was found to increase to the faintest luminosities, the slope of the SF became slightly shallower than that measured for more luminous QSOs.
In this paper, we expand on the study of GS14 using the SDSS DR17, the data release including observations through January 2021.We identified close to 9,600 galaxies with broad emission lines, extended morphologies, and sufficient spectroscopic and photometric signal-to-noise (S/N) necessary to model the host galaxy contribution and measure various AGN properties including the ensemble SF.We use this large sample of AGN to explore correlations between the ensemble variability properties and AGN characteristics such as accretion rate and black hole mass at lower luminosities than previously studied.
In Section 2, we describe the selection of the data set utilized for this study as well as the control set of normal galaxies to characterize the photometric noise.We explain the principle component analysis (PCA) decomposition used to determine the host galaxy fraction for each AGN and the procedure used to correct the variability amplitude for the host galaxy contribution.We also discuss the sample trimming based on the broad emission line measurements, spectroscopic signal-to-noise, and host galaxy fraction.In Section 3, we measure the broad emission lines for the AGN component of each source and determine the black hole masses and accretion rates for our sample.In Section 4, we compute the ensemble Structure Function for low-luminosity AGN and perform multidimensional fits of the AGN parameters and variability amplitudes.We discuss the implications of these correlations in the low-accretion rate, low-luminosity AGN regime for AGN in Section 5 and present conclusions in Section 6.Throughout the paper, we use photometric data from SDSS in their system (Lupton et al. 1999), which is similar to the AB photometric system, and ΛCDM cosmology with Ω M = 0.30, Ω Λ = 0.70, and H 0 = 70 km s −1 M pc −1 .

The Survey
The Sloan Digital Sky Survey is a photometric and spectroscopic survey covering almost 15,000 square degrees or about one-third of the entire celestial sphere in five broad bands, namely u, g, r, i, and z.We utilize data from the Data Release 17 (DR17), which includes imaging and spectroscopy data obtained over 19 years up to and including January 2021 (Abdurróuf et al. 2022).This is the final data release for SDSS-IV which contains more than a billion catalog objects, including more than 200 million galaxies.Among those, there are almost 3 million galaxies and 1 million quasars with spectroscopy.The photometric depth (5σ) of the survey is 22.15, 23.13, 22.70, 22.20, and 20.71 magnitudes for u-, g-, r-, i-, and z-bands, respectively.

Selection Criteria for SDSS DR17
We follow a similar approach as GS14 and perform the CasJobs' SQL query from the SDSS DR17 to gather all relevant parameters such as object ID, coordinates, PSF-corrected flux and magnitude, fiber flux and magnitude, spectroscopic flux and magnitude, median signal-to-noise in the respective band, observed Modified Julian Date (MJD), absolute magnitude, extinction, redshift, and K-correction factor.We select sources that are defined as morphologically extended (photometric type = 3) and display at least one broad emission line in the spectrum (spectral class = "QSO"), aiming toward the lower-luminosity AGN.We further required that the broad line measure at least 1,000 km s −1 and no more than 10,000 km s −1 .This ensures that the spectral classification is based on the presence of at least one robustly-detected broad emission line and that the emission line width is consistent with Doppler broadening of gas around black holes up to ∼10 9 M ⊙ .
We impose a redshift constraint of z≤0.84 to ensure that the spectra contain the entire broad Hβ emission feature.The Hβ emission line is useful for our determination of supermassive black hole mass (M SM BH ).The redshift cut guarantees sufficient overlaps between the observed spectra and the templates for high-quality fitted results.The total number of AGN that pass the above criteria is 32,828 sources.
We also require a sample of "normal" non-varying galaxies as a control group to gauge the photometric offsets and noise between different epochs.We select 2,450,896 normal galaxies (extended sources with no broad emission lines) from the entire SDSS DR17 with the same set of criteria except setting the spectral class to "GALAXY" to exclude objects with broad emission lines.We also apply the same redshift cut to this control group.

Measuring Variability
We calculate the variability of our sample using the differences in photometric magnitudes between two epochs; the imaging data and the spectroscopic data obtained for each galaxy in SDSS.Thus the baseline between two epochs can span from days to a decade.The spectroscopic reduction pipeline (Stoughton et al. 2002) calculates synthetic spectrophotometry for the g-, r-, and i-filters.The magnitude difference is computed as ∆g = g sp -g ph and similarly for other bands.Note that this is in the opposite direction of the magnitude differences from GS14, thus ∆g = -∆g GS14 .We drop the u-and z-bands from further analysis due to the lack of overlaps between the templates and the observed spectra in both bands for higher redshift bins of the sample.
During the summer of 2009, there was a change in the multi-object, fiber-fed spectrograph of SDSS (Smee et al. 2013).The spectrograph was upgraded for the Baryonic Oscillation Spectroscopic Survey as a part of the SDSS-III (Dawson et al. 2012).The change of spectrograph, however, resulted in significant differences between the spectrophotometry with the new instrument and the imaging photometry obtained from the previous version of the instrument.For this reason, we compute systematic offsets and noise characteristics for our sources in two separate groups, namely Group I and Group II.Group I contains AGN and galaxies with spectra taken before the Modified Julian Date (MJD) of 55,000 days (i.e., before the upgrade).Group II contains those with spectra taken after the upgrade.There are 13,112 and 19,716 AGN in Group I and Group II, respectively.Due to the different observing strategies between SDSS-I through -II and SDSS-III through -IV, Group II sources are at higher redshifts and, thus, fainter and lower S/N values.
To properly determine the magnitude difference between the two epochs, we use the SDSS aperture photometry within 3-arcsecond and 2-arcsecond diameters to closely match the fiber sizes of the spectroscopic observations of Group I and Group II samples, respectively.Since the spectrophotometry was calibrated using PSF-fitted magnitudes (Adelman-McCarthy et al. 2008), which includes light beyond the aperture of the spectroscopic fiber, the photometric offset varying significantly with S/N was observed among both Group I and Group II's AGN and normal galaxies.
To quantify the offset, we divide the sources into 16 S/N bins spread across the middle 95 percent of the sources (corresponding to the 2σ confidential interval) with an equal number of sources per bin.In each filter of Group I and II, the centroids and standard deviations of the magnitude difference distribution are measured.The left panels of Figure 1 and 2 show the distributions of magnitude differences as a function of S/N for Group I and Group II of the control sample (normal galaxies), respectively.The black crosses represent the median values of the magnitude differences in each bin.In the high S/N regime (S/N>5), the offset is -0.32 and -0.70 for Group I and Group II, respectively.
We fit a second-order Laurent polynomial to these values to determine the offset as a function of S/N for each group in different filters.The offset is then applied to the magnitude difference for each normal galaxy to center the magnitude difference distribution at zero.After centering in each band, we measure the spread using the standard deviation as a function of S/N for the control sample.This represents the "photometric noise" expected for non-varying galaxies in our survey.The right panels of Figure 1 and 2 show the Gaussian fits to the distributions of magnitude differences as functions of S/N.The photometric noise decreases almost exponentially as the S/N values rise.The Group II control sample has an overall higher photometric noise due to the change in spectrograph and observing strategies mentioned earlier.We also fit a second-order Laurent polynomial to these data points in both groups and use the derived functions in Section 4 to obtain the characteristic photometric noise, σ phot , for each AGN in the sample.
The offsets of magnitude differences are found to be slightly higher for the AGN sample compared to the normal galaxies sample in all bands.The same characteristics were also noted by GS14 and may be due to differences in the morphologies of AGN (e.g., a higher central concentration of light) compared to normal galaxies.Therefore, we compute the magnitude offset separately for the AGN sample in both groups and divide the samples into S/N bins following the procedure used for the normal galaxies.We determine the centroids of the magnitude differences shown as brown triangles in Figure 3.The left and right panels of Figure 3 show the magnitude differences offsets of AGN in Group I and Group II, respectively.For instance, at the S/N of 10, the offsets are -0.31 and -0.95 for AGN in Group I and Group II, respectively.We, again, fit a second-order Laurent polynomial to determine the offset as a function of S/N for each group in each filter.This offset is applied to the magnitude differences for all AGN in our sample to center its distribution at zero.The variability analysis in Section 4 makes use of these offset-corrected magnitude differences.
To demonstrate the intrinsic variability among the AGN compared to the non-varying galaxies, Figures 4 and 5 show the offset-corrected magnitude differences for the AGN sample (left panels) and the distribution of these values around the key S/N levels of 10 and 5 (right panels) with the distribution of the normal galaxy sample at the same S/N shown with a dashed line for Group I and Group II objects, respectively.A fit to the distribution of magnitude differences (dotted line) yields a Gaussian 1σ value of 0.22 for the AGN sample and 0.05 for the normal galaxies in the g-band for Group I. Similarly, the fits are 0.15 and 0.05 for the r-band and 0.13 and 0.06 for the i-band.In Group II, since the overall targets are fainter, the bulk of S/N distribution shifted toward lower S/N values.Thus, we show the distribution of the magnitude differences in Group II at a S/N of 5.The significantly larger magnitude differences for the AGN sample in all bands and both groups demonstrate their variable nature beyond the photometric noise.
Figure 1.Left: Change in magnitude differences as a function of spectroscopic S/N of Group I normal galaxies sample (control sample) in g-, r-, and i-bands, respectively.Right: The corresponding standard deviations (sigma values) of the corrected magnitude differences in each band.The black crosses represent the centroids of magnitude differences and the standard deviations in each bin for all bands.While the black-dashed lines are the best-fitted Laurent polynomials.

PCA Decomposition
To properly analyze the variability properties of our sample, we must first take into consideration the contamination of light from the host galaxy.Since we have selected the morphologically extended galaxies, the flux contribution from the host galaxy is non-negligible.Assuming that light from the underlying galaxy is not varying, the host galaxy's flux will dilute the observed variability of the AGN.The effect is more severe as the host galaxy contribution rises.To gauge the contribution from the host galaxy, we needed to separate the observed spectrum into two components consisting of the AGN and the host galaxy.We follow the formalism in deconstructing the AGN and host galaxy components as described in GS14 and VB06.Any source spectrum can be expressed as a linear combination of eigenspectra as shown here: where f R λ is the reconstructed flux density as a function of the wavelength of the interested component, while a k and e k (λ) are the k-order eigencoefficient and the corresponding eigenspectra, respectively.
We adopt the Principal Component Analysis or PCA fitting and decomposition routine from Hao et al. (2005), previously developed to analyze SDSS spectra for quasars and galaxies.However, we have modified the routine for use in Python utilizing SciPy's optimization with weights and penalties to fit all eigenspectra to the observed spectrum simultaneously.The eigenspectra for galaxies and quasars are characterized and made available by Yip et al. (2004a,b).For the eigenspectra of quasars, we pick the group that represents the brightest population at the lowest redshift range.We choose the low-redshift bin, namely "ZBIN 1", which covers the redshift range between 0.08 < z < 0.53, overlapping with the majority of AGN in our sample.We also select the highest luminosity bin, namely "C1", with the absolute magnitude of quasars between -26 ≤ M i ≤ -24 (Yip et al. 2004b).The latter selection guarantees that the quasar eigenspectra will have minimal contamination from the host galaxy.Our redshift limit of z≤ 0.84 ensures sufficient overlap between the AGN's rest-frame spectra and the eigenspectra for a high-quality fit and decomposition result.
The number of eigenspectra or PCA bases needed to fit and decompose the observed AGN spectra efficiently can be varied from a few to fifty vectors.However, many studies (e.g., Connolly et al. 1995;Connolly & Szalay 1999;Reichardt et al. 2001;Yip et al. 2004b;Gallastegui-Aizpun & Sarajedini 2014;Davies et al. 2018;Fagioli et al. 2020) have shown that the majority of the information in the spectra is contained within the first few vectors.For our purpose of gauging the contributions of AGN and host galaxy, we determine that using 5 galaxy eigenspectra and 10 quasar eigenspectra to perform the PCA decomposition is sufficient (e.g.,GS14; VB06).To separate the components, we assign weights (w(λ) = 1/σ(λ) 2 ) to the observed flux density; where σ(λ) 2 is the variance of each flux density value.The w(λ) is of the order unity except for around prominent emission features of the AGN (Stoughton et al. 2002).The regions within ± 5,000 km s −1 of Hα, Hβ, Hγ, and Hδ emission lines, and ± 600 km s −1 of [OII λλ 3727,3729] and [OIII λλ 4959,5007] are given weights of 0.001 to penalize the peaky emission features with high S/N (Hao et al. 2005).This approach mitigates the issue of overestimating the continuum levels in favor of fitting strong emission features.Thus, we ensure that the overall contributions of each component are as accurate as possible.
We measure the contribution from the host galaxy in a similar fashion as VB06 and GS14.This is known as the "Host Fraction" or "Ψ".The Ψ value of each object is the ratio between the integrated observed flux of the host galaxy component and the observed total flux (AGN + host) within the corresponding Sloan filter.Since the host fractions are derived from the reconstructed spectra, we can calculate Ψ in both observed-and rest-frames for all Sloan filters.Ψ can be expressed as follows: where R j λ represents the filter response as a function of wavelength in an arbitrary filter "j", and λ1 and λ2 are cut-on and cut-off wavelengths of the corresponding Sloan filter.While, f AGN λ and f Host λ are the reconstructed flux densities of the host galaxy and the AGN components, respectively.Finally, M j λ is the mask on the specific wavelengths within the corresponding filter.The mask blocks out the region with the size of 2.5× full width at half maximum (FWHM) of the broad emission lines (or ±3σ) of the AGN component and the region with the size of 1,000 km s −1 of the narrow emission lines of the host galaxy component.Thus, the host fraction value calculated in each filter is essentially the ratio between the average continuum levels of the host galaxy and the combined AGN + host galaxy.A few examples of the PCA decomposition results are illustrated in Figure 6.From the top to bottom panels of Figure 6, we show three AGN having a median S/N in the r-band around 10 and the corresponding host galaxy fraction values in the observed frame, Ψ r , of >0.8, ∼ 0.5, and <0.2, representing sources where the host galaxy is more than 80%, around 50%, and less than 20% of the total light.The masked areas are indicated by green-shaded regions.With the host fractions determined, we can calculate the magnitude difference between a pair of epochs for the AGN component corrected for the dampening effect of the non-varying host galaxy.The magnitude difference between 2 epochs of a host-subtracted AGN in any arbitrary band, "j", can be expressed as follows: (3) where ϕ j = 10 ∆mj /−2.5 is the corrected spectrophotometry to photometry flux ratio (f spec j /f phot j ) and k j = -0.2×(∆mj ).The value ∆m j is the offset-corrected total apparent magnitude difference (m spec -m phot ).The values of f spec j and f phot j are fluxes in any arbitrary band, j, from spectrophotometric and photometric epochs, respectively.Note that we cannot use the same host fraction values for both epochs since it contradicts the initial assumption that the host galaxy component is non-varying.Thus, we allow the Ψ value to vary (in the opposite direction) as the brightness of the AGN varies, or Ψ ′ = Ψ×f spec j /f phot j .Because the corrected magnitude differences of the AGN components are quite sensitive to the values of host fractions, we need to be certain that the estimated host fraction values obtained through this procedure are accurate and reliable.We verify the reliability of the host fraction (and ultimately AGN fraction) determination in the following section.

Reliability of AGN/host galaxy fraction from PCA fitting
Figure 7 shows the host galaxy fraction in the i-band, Ψ i , as a function of various observational parameters.Here we plot Ψ i versus the apparent magnitude from the SDSS photometry, the extinction corrected absolute magnitude of the AGN component as determined from the PCA fit, the source redshift, and the SDSS spectroscopic S/N in the i-band.We show the AGN from Group I and Group II as blue and orange points, respectively.The upper and lower left panels highlight the post-2009 survey differences, targeting higher redshift and fainter sources generally.The upper right panel shows that the luminosity of the AGN component is fainter where the host galaxy dominates, extending to M i ∼-18.There are no significant trends with Ψ i and S/N in the i-band, indicating that the full range of AGN-to-host galaxy ratios can be detected at all S/N values.
Since the Ψ parameter has a significant influence on the determination of the AGN magnitude and variability correction, we perform additional simulations to ensure that all sources included in our study have reliable estimates for host fraction.To test the accuracy of our PCA decomposition routines, we use the quasar and galaxy eigenspectra to generate a synthetic spectrum of an AGN.We tune the eigenvalues of the eigenspectra to match the range of Ψ i values from 0.0 to 1.0 and impose fluctuations of the flux densities for the synthetic spectra with a range of S/N i values from 2 to 50, consistent with the range of our sources.We simulate spectra in 40 bins of Ψ i and 32 bins in S/N i , equivalent to 1,280 cells on the Ψ i -S/N i space.In each cell, there are 25 synthetic spectra with random redshift values, drawn from the same redshift distribution of our sources, with the Ψ i and S/N i values drawn from a uniform distribution bounded by the cell's edges.Thus, we generate a total of 32,000 simulated AGN spectra.The simulated spectra are fed through our PCA decomposition routines.The deviations from the input values of Ψ i are recorded and averaged for each cell.Next, we increase the sampling along the Ψ i and S/N i grids to 100 and 192 bins (or 2.5× and 6×), respectively.The up-sampling is done by 2D interpolation with a linear estimator.Finally, we smooth the grids with a 2D Gaussian Kernel using a standard deviation of 4 (up-sampled) cells.The resulting contours of the average uncertainty of the recovered AGN fraction in the i-band as a function of the input AGN fraction, or (1-Ψ i ), and S/N i is shown in Figure 8.This figure is similar to the lower right panel of Figure 7, but with the y-axis flipped so that higher AGN fractions are at the top and host galaxy-dominated sources are at the bottom of the figure.
As one expects, the simulation shows that the least reliably recovered AGN fraction values have low S/N and low AGN fractions.Based on this result, we choose to include only the AGN with reliability greater than 95% in recovering the correct AGN fraction.This eliminates sources to the left and below the line labeled "5.000", removing most sources with S/N i <7 and having AGN fractions less than 20% (or host galaxy fraction larger than 80%).Due to these cuts, a substantial number of Group II AGN are removed from our sample.Based on the results of this simulation, we continue the variability analysis with 9,798 AGN (7,649 sources in Group I and 2,149 sources in Group II). 3. ANALYSIS

Broad Emission Lines Extraction
With our sample of reliably decomposed spectra, we carry out the remaining analysis using the host galaxy-subtracted spectrum for each AGN.Since we are interested in determining the black hole masses, we carefully fit and extract the parameters of the broad Hβ emission line, which is present in all spectra.The PCA decomposition procedure focused on the accuracy of continuum levels and extended features and penalized models that overestimate peaky features such as narrow emission lines.This can result in residuals of the narrow features in the host galaxy-subtracted AGN spectra.To account for this and accurately measure the width of the broad emission, we fit multi-component emission features around the Hβ emission line.
In the fitting process, we shift the spectra to rest-frame and crop the wavelength range around 4,500 -5,500 Å.We start the process by fitting a broad (1,000 km s −1 < FWHM ≤ 10,000 km s −1 ) and a narrow (FWHM ≤ 1,000 km s −1 ) Gaussian profiles to the position of Hβ and two Gaussian profiles (FWHM ≤ 10,000 km s −1 ) at the positions of the [OIII] doublets, simultaneously.We then mask the fitted emission features with the width of ±3σ around the lines' centroids as shown in the green-shaded regions in Figure 9.It is well known that F eII emissions are significant and observable in the UV through the optical regimes of AGN spectra (e.g., Osterbrock 1977;Boroson & Green 1992;Sulentic et al. 2000).They overlap with the Hβ and [OIII] features and can be modeled and fitted (e.g., Grandi 1981;Wills et al. 1985;Bruhweiler & Verner 2008;Kovačević et al. 2010;Pandey et al. 2024).Therefore, we fit the sum of five families of F eII emission features with velocity width between 1,000-10,000 km s −1 and a simple linear slope as the estimate of the continuum level through the unmasked patches of the spectrum (e.g., Oleas et al. 2016).Then, we subtract the continuum and F eII emission features from the original cropped rest-frame spectrum.After that, we fit the Hβ and [OIII] features again and update the best-fitted parameters of the emission features.The fitting steps are repeated several times until the extracted parameters of the Hβ emission features converge (i.e.changes in values less than 1%).From the fitting results of 100 randomly selected spectra in our sample, we determine that a minimum number of 3 iterations is necessary to yield a high-quality fit.This is based on the convergence of the Hβ emission width values, the residual of the flux density centered at zero with no clear systematic offsets, and the root-mean-square deviation being comparable to the square root of the flux density variance.We show examples of the emission lines fit in Figure 9.By accurately fitting and subtracting F eII, we prevent the systematic overestimation of the broad Hβ line's width.We perform the fitting algorithm on all host galaxy-subtracted spectra in our sample to accurately determine the width of the Hβ for our AGN.However, our fitting routine results in a surplus of sources with FWHM of the broad Hβ emission equal to 1,000 km s −1 .This is due to the fitting constraint imposed on the minimum width for broad emission features.Thus, we make another cut to remove sources with the best-fitted FWHM of Hβ with values less than 1,100 km s −1 .Therefore, we arrive at the final sample of 9,583 AGN (7,579 sources in Group I and 2004 sources in Group II).

Properties of the Low-Luminosity AGN Sample
With the broad Hβ line properly measured, we can compute the black hole mass for each source in our sample.We adopt the black hole mass equation from Park et al. (2015) using the width of the Hβ line with the monochromatic continuum luminosity of the AGN measured at 5,100 Å (i.e., log(M SM BH /M ⊙ )= 7.536 + 0.519log(λL 5100 /10 44 erg s −1 ) + 2log(σ Hβ /1,000 km s −1 )).The uncertainty in the determination of the host fraction (and AGN fraction) directly affects the measurements of flux density and ultimately the monochromatic luminosity at 5,100 Å.The reliability cut described and applied in Section 2.5 ensures that the uncertainty arising from PCA decomposition will be at most 25% for λL 5100 values.This yields an uncertainty of 0.07 dex in log(M SM BH /M ⊙ ), which is marginal compared to the scatter due to the scaling relation itself of 0.4 dex (Park et al. 2015).
We can also calculate the bolometric luminosity adopting the bolometric corrections of Duras et al. ( 2020) using the rest-frame Bessel-B-band (Bessell 1990) luminosity of the AGN (i.e., L bol =5.18×L B ).The B-band luminosities are computed by, first, convolving and integrating the extinction-corrected, host-galaxy-subtracted AGN spectra over the Bessel-B filter function.Next, the integrated B-band fluxes are multiplied with the redshift and distance dilution factors, (1+z) and (4πd 2 L ); where d L is the luminosity distance.With these measurements, we can then determine the accretion rate as the ratio of the Eddington luminosity and the bolometric luminosity.Then, we compute the Eddington luminosity as L Edd = 1.2 x 10 38 ×M SM BH where luminosity is in erg s −1 and black hole mass is in solar units (Eddington 1926).
Figure 10 shows the extracted parameters (i.e., accretion rate, black hole mass, and bolometric luminosity) versus redshift with the lower left panel showing the black hole mass vs the accretion rate as the Eddington Ratio, R Edd = L bol /L Edd .The dashed and dash-dotted lines indicate selection effects imposed on our sample due to survey depth and upper/lower limits placed on the fitted width of the Hβ emission line.We compare these properties to those for quasars in SDSS Stripe 82 (MacLeod et al. 2010).Most notable is the difference in the bolometric luminosity of our sample, which has a median log(L bol [erg s −1 ]) of 44.6 compared to the quasar sample's median value of 46.2, which is almost 100 times more luminous.Our median black hole mass, log(M SM BH [M ⊙ ]), is 7.8 compared to 8.9 for the  quasar sample.Finally, the median accretion rate of our sample is 5% Eddington, with 20% of the sample having accretion rates less than 1% Eddington.The accretion rate for the typical quasar sample is about 10 times higher.This demonstrates the unique nature of our sample of SDSS AGN to investigate trends of variability in previously unstudied AGN parameter space.

ENSEMBLE VARIABILITY CHARACTERISTICS
We compute the ensemble variability, V, using the following equation adopted from L18: where ∆m represents the magnitude difference of each AGN and σ S/N is the photometric noise as a function of S/N derived from the control galaxy sample in Section 2.3.Our equation differs slightly from L18 since we add a factor of 2 before the noise term, a necessary correction described in Koz lowski (2016).This accounts for the fact that our noise is computed as a standard deviation value and, therefore, must be doubled to represent the ∆m values of AGN, which vary both above and below the median value.

Structure Function
The Structure Function or SF is a method used to characterize the variability of AGN (e.g., Bauer et al. 2009;De Cicco et al. 2022;Vanden Berk et al. 2006;Gallastegui-Aizpun & Sarajedini 2014;Li et al. 2018).It is the relation between variability magnitude, V, and the rest-frame time lags, ∆τ , where the time interval between the two photometric measurements is corrected for time dilation effects by dividing by 1+z.Other techniques exist to quantify AGN variability such as computing the autocorrelation function (ACF).However, we compute the SF in order to compare with previously published results for ensemble variability of AGNs.In Figure 11, we show the ensemble SFs for the SDSS AGN sample, which is the averages of the V values of all AGN as defined in Equation 4as a function of time-lag ∆τ .The error bars are computed as the error on the mean of each bin.
The SF of each of the three filters is shown in blue (g-band), green (r-band), and red (i-band).The sample is binned into 12 equal intervals in log space (except for the first bin with the time lag of 0 to 10 days).Each rest-frame time lag bin contains between 62 and 1,618 sources with variability values above the photometric noises.We have applied a sigma-clipping routine to trim outliers from the sample in each interval.We perform 5σ-clipping with a maximum of 5 iterations to the ∆m values in each bin to minimize the impact of individual outliers with large magnitude differences when computing the ensemble behavior of the AGN in each time lag bin.This process removes 1 to 2% of sources from each bin or between 1 to 30 objects from the bins of 62 to 1618 objects.The outliers removed have unusually high variability (V>2 in some cases).Thus, this routine robustly minimizes the effect of these outliers in skewing the mean variability amplitude in each bin.The AGN SF is generally found to increase with the rest-frame time lag, approaching a maximum variability amplitude, SF inf , between 0.1 and 0.5 at 1-to 3-year time intervals (see MacLeod et al. (2010)).We can parameterize the SF with the following equation, where A and γ are the amplitude and slope of the function.As explained in L18, averaging over a large number of objects, which is necessary when computing ensemble statistics, will produce only a power-law relation since each source has its own set of SF parameters.Therefore, we fit only the slope of the relationship and show these fits as dashed colored lines corresponding to each band in Figure 11.The slopes of the SFs for our sample are between 0.208 and 0.217.These values are slightly shallower than those measured for low-luminosity AGN in GS14 but agree within 1σ uncertainties.Rather than determining the SF slope from least-square fitting, we perform 500 iterations of bootstrap resampling to quantify the spread of the SFs and adopt the medians and standard deviations as the slope values and their uncertainties as shown in Figure 11.Some studies have pointed out a caveat that estimating the uncertainty of the SF slope from least-square fitting may not represent the true underlying distribution of the slope values.For instance, Emmanoulopoulos et al. ( 2010) have shown that the uncertainties on the slopes and breaks of blazars' SFs are several times underestimated when only considering the results of the least-square fit method.
Several previous studies have measured the slope of the ensemble Structure Function for quasars.We show the most recent ensemble SF for quasars from L18 as solid lines in Figure 11.Colors correspond to the same bands as this work.The slopes for the quasar sample are generally shallower than our low-luminosity sources, with slopes closer to ∼0.160.Some of these differences in slopes could be because L18 did not multiply the noise component by a factor of 2 before subtraction, which would have the effect of making the slopes shallower.An earlier study by Vanden Berk et al. (2004) found ensemble SF slopes for SDSS quasars that were much steeper, with values closer to ∼0.3 in all bands.This is more consistent with the values published by Morganson et al. (2014) who found the slopes to be ∼0.25.All of these slopes are shallower than the predicted power-law slope of 0.5 for the damped random walk (DRW) model often used to describe quasar variability (e.g., Kelly et al. 2009).
The overall amplitude of the low-luminosity sample is greater than that of the quasar samples, most notably at long intervals.We further explore the variability amplitude as a function of AGN luminosity in Figure 12.Here we determine the ensemble value of V in bins of AGN i-band absolute magnitude in rest-frame.We confirm the previously observed anti-correlation between AGN luminosity and variability amplitude (e.g., Zuo et al. 2012;Simm et al. 2016).We find that the increase extends to absolute magnitudes of -18.This is consistent with the results from GS14 but is now shown with greater statistical significance.The slope of the relationship is shallower than that found in GS14.However, if we restrict to magnitudes fainter than -22, we find similar slopes and overall amplitudes with this previous study.
There is a break in the variability amplitude at absolute magnitudes around -20.This appears to be caused by the fact that many of the intrinsically fainter AGN in our sample from Group I had the longest time intervals between their photometric and spectroscopic measurements due to observational survey parameters.For this reason, their overall variability amplitudes tend to be among the highest.The non-uniformity of the dataset highlights the importance of separating the variability properties and AGN characteristics to investigate these trends.In the next section, we focus on doing this to quantify relationships between variability and each AGN property while constraining the impact of the other characteristics.

Variability versus Time lag, Luminosity, Redshift, Wavelength, Black Hole Mass, and Accretion rate
We adopt the strategy of L18 and perform a multidimensional fit to the SF as shown in Equation 5, with slope and variability amplitude as the dependences of bolometric luminosity, black hole mass, redshift, and rest-frame central wavelength of the observed band.In each parameter space, we split the sample into 5 bins, with equal spacing in log and linear spaces.For instance, the rest-frame wavelength and redshift spaces are each split into 5 equal-sized bins from λ c = 2,700 to 7,300 Å, and from z = 0.01 to 0.84, respectively.While bolometric luminosity and black hole mass spaces are split into 5 equal-sized bins along the log-scale from L bol = 10 43 to 10 46 erg s −1 , and from M SM BH = 10 6.5 to 10 9.5 M ⊙ , respectively.Moreover, we have 10 bins in the log-scale for time lags from 0 to 10 3.7 days.This equates to splitting our sample into 6,250 cells across the 5 parameter spaces.Next, using one of SciPy's non-linear least squares optimizations (i.e., Levenberg-Marquardt algorithm), we fit the multidimensional parameters of the variability amplitude and slope following the formalism in the work by L18 as expressed here: where A 0 are the intrinsic level of variability amplitude at the time lag of 1 year.B z , B L , B λ , and B M are the coefficients corresponding to the variability amplitude dependence on redshift, bolometric luminosity, the central restframe wavelength of the observed band, and black hole mass, respectively.The intrinsic slope of the Structure Function is represented by γ 0 .While, β z , β L , β λ , and β M are the coefficients corresponding to the SF slope dependence on redshift, bolometric luminosity, the central rest-frame wavelength of the observed band, and black hole mass, respectively.
To assess the uncertainties of these fitted parameters, we perform 500 iterations of bootstrap resampling of our AGN and disregard the cells with a total number of points less than 5. Thus, in any given iteration, the number of cells along any parameter space varies between 3 to 5, except for the time lag space.The multidimensional fitting is done following the prescription in Equation 6.The average and the 1σ uncertainty of each parameter is calculated based on the distribution of the best-fitted values of all bootstrap iterations and presented in Table 1 and 2. The first three rows of Table 1 show the SF slope dependence on these parameters.We find very little significant correlations between the parameters and the SF slope.There is a positive 1σ correlation with luminosity (Column 4), which is consistent with that identified in L18.Despite the broad range of masses sampled in our study, we find no significant dependence on black hole mass.The lack of dependencies found for redshift or wavelength may be due to the restricted redshift range of our sample.We check for changes in the fits when removing the redshift dependence in the second row and the wavelength dependence in the third row, and do not find any change in the correlations with the remaining parameters.
The first three rows of Table 2 show parameter fits for the variability amplitude.We find that amplitude depends on bolometric luminosity with a 3σ significance and black hole mass with 2.3σ significance.The variability amplitude dependence on bolometric luminosity confirms the well-known negative correlation demonstrated in Figure 12.We find almost the same dependence on bolometric luminosity as that reported for the quasars in L18.This result verifies that the trend with bolometric luminosity is robust over more than 4 orders of magnitude, extending to fainter magnitudes than previously studied.We also test the removal of the redshift dependence in the second row of Table 2 and find that there is a negative correlation with wavelength with a 1.6σ significance.As shown in Table 2, removing both redshift and wavelength dependence does not impact the correlations with the other parameters significantly.
Although the significance is low, we find a positive correlation between variability amplitude and black hole mass.Previous studies have reported mixed findings concerning this correlation with some identifying a positive correlation (e.g., Wold et al. 2007;MacLeod et al. 2010;Lu et al. 2019;Suberlak et al. 2021) and others reporting negative or unclear results (L18; Zuo et al. 2012;Simm et al. 2016).Zuo et al. (2012) found that the dependence is positive when the influence of luminosity is excluded, but negative when the Eddington accretion rate is excluded.To test this, we revise the fitting prescription in Equation 6 into the prescription as follows: where R Edd represents Eddington Ratio as an indicator of accretion rate onto a supermassive black hole and the other parameters remain the same as in Equation 6.This alternative formalism expressed in Equation 7 replaces the bolometric luminosity term with Eddington Ratio, R Edd .This change is also justified since the bolometric luminosity is related to black hole mass as inferred in Figure 10.Thus, the accretion rate may be a more fundamental AGN parameter.The results of this multidimensional fit are presented in lines 4 through 6 of Tables 1 and 2.
We find a negative correlation for accretion rate with variability amplitude with a 3σ significance, indicating higher variability amplitudes for lower accretion rates.This is consistent with the results of previously published quasar studies (e.g., Zuo et al. 2012;Simm et al. 2016;Sánchez-Sáez et al. 2018;Lu et al. 2019), though this is the first time the trend is confirmed at accretion rates less than log(R Edd ) of -1.75 (∼2% Eddington).We also confirm the results of Zuo et al. (2012) that the relationship with black hole mass switches from positive to negative depending on whether luminosity or accretion rate is considered.When the accretion rate is controlled, the trend becomes negative with 1.5σ significance.The negative correlation between variability amplitude and wavelength is still observed with almost 2σ significance, with bluer wavelengths displaying higher variability amplitudes.If we remove the redshift parameter from the fits, the relationship with wavelength becomes more significant, as does the relationship with accretion rate and amplitude, while the correlation with black hole mass becomes weaker.Removing both redshift and wavelength parameters does not significantly impact these results.
Figure 13 is a visual representation of the trends with variability amplitude, accretion rate, and black hole mass occurring in all three bands.The color of each square represents the average variability amplitude for all AGN in a given range of black hole masses and accretion rates.To obtain meaningful statistics, we exclude all the cells, which contain less than 10 AGN per cell.Warmer colors are associated with greater variability amplitudes.While it can be generally observed that the amplitude values increase with decreasing black hole mass, the amplitude increase is more obvious in the vertical direction as the accretion rate decreases.To test if the trends differ significantly between AGN populations with low and high accretion rates or between small and large black hole masses, we divide our sample at a black hole mass of log(M SM BH ) = 8 and accretion rate at log(R Edd ) = -1.25 and perform the multidimensional fit for variability amplitude with bootstrap resampling using the formalism in Equation 7. The direction of the trends remains the same for both accretion rate and black hole mass cases, but the negative correlation with black hole mass becomes greater (B M = -0.215±0.125)and slightly more significant among the low-mass black hole sub-sample.We also find that the negative correlation with accretion rate increases (B R Edd = -0.307±0.126)but decreases in significance among the low-mass black hole sub-sample.As illustrated in Figure 13, the negative correlation between black hole mass and variability amplitude is primarily driven by the low accretion rate AGN and this is confirmed by our multidimensional fits.At high accretion rates, the change in variability amplitude with black hole mass disappears (B M = -0.004±0.185).However, at low accretion rates, the change in amplitude with black hole mass demonstrates a stronger correlation (B M = -0.162± 0.151) with slightly greater than 1σ significance.Taken together, these results suggest that the accretion rate is the main driving parameter for variability amplitude with black hole mass having a less significant impact on this relation.
To make certain that our sub-samples in black hole mass and accretion rate are not biased in terms of the SF time intervals sampled, we check for any dependence of these properties on ∆τ .We find that the average black hole mass remains between log(M SM BH [M ⊙ ]) = 7.9 to 8.1 as a function of ∆τ , with a standard deviation of ±0.3, while the average Eddington Ratio clusters around log(R Edd ) = -1.40 to -1.60, with a standard deviation of ±0.27.This ensures that our sub-samples represent the full range of parameters over all timescales.

DISCUSSION
The significance of the accretion rate as a driving factor in variability amplitude that we identify has been previously suggested, though the precise reason behind the relationship is not entirely understood.MacLeod et al. (2010) argue that changes in the accretion rate are unlikely to produce this relationship since the accretion rate operates on longer viscous time scales that do not correspond to the shorter fluctuation time scales observed in variability studies.They also suggest that it could be the result of the amplitude anti-correlation with wavelength, which we also confirm in our study.Since higher accretion rates result in a hotter disk, this pushes the optical flux to a larger radius in the disk.If longer wavelengths are emitted further out in the disk, the lower variability amplitudes at long wavelengths could produce the observed relationship with the accretion rate.Sun et al. (2020) proposed the CHAR (Corona-heated Accretion-disk Reprocessing) model, suggesting that the outer accretion disk and the innermost corona of an AGN are magnetically coupled.Thus, while the turbulence in the corona drives variability in the X-ray domain, it also fluctuates the heating rate of the accretion disk, giving rise to the variability in the UV/Optical domain.This model also points to the anti-correlations between variability amplitudes with black hole mass and the Eddington Ratio for timescale between days to years.This aspect of the CHAR model is consistent with our finding as shown in Table 2 and Figure 13.Similarly, another model has been proposed by Yu et al. (2022) to explain the observed anti-correlation between amplitude and accretion rate.They argue that lower accretion rates result in a larger X-ray-emitting corona relative to the accretion disk.If the variable photons originate from this corona, this would produce a greater number of reprocessed photons in the disk resulting in a higher amplitude of variability.
The Yu et al. (2022) model also predicts the anti-correlation with black hole mass that we observe.They argue that the radius on the disk that emits at a fixed wavelength and accretion rate moves outward with increasing black hole mass.This results in a larger distance between the illuminating X-ray corona and the disk, producing lower intensity of the X-ray photons and thus a smaller amplitude of variability due to reprocessing of the photons in the disk.We observe the anti-correlation of variability amplitude with black hole mass when the accretion rate is accounted for.As predicted by their model, the black hole mass appears to have a secondary effect on the variability amplitude, displaying a weaker significance in the multidimensional fits than the accretion rate.
Recently, Arévalo et al. (2023) show that the relation between variability amplitude and black hole mass is negative when considering bins of constant accretion rate.They also show that the trend becomes more apparent on shorter timescales, arguing that the SF amplitude is relatively constant on long time scales and drops at shorter time scales, where the characteristic timescale varies with black hole mass.Our results are consistent with this finding, although we cannot confirm the trend being caused by changes in the characteristic time scale using ensemble statistics.We demonstrate, however, that the strength of the correlation is driven by sources at low accretion rates and is most apparent when sampling AGN with increasingly small black hole masses.

CONCLUSIONS
We have presented an ensemble variability analysis of a sample of ∼9,600 AGN selected from the SDSS displaying broad emission lines and extended morphologies.Using principal component analysis, we decompose the host galaxy and AGN light to determine the AGN fraction for each source.Variability is determined using the difference between the photometric observations of each galaxy and the spectrophotometry in the g-, r-, and i-bands.Black hole masses are determined using the broad Hβ emission line and flux at 5,100 Å.
Our AGN sample extends to fainter luminosities, smaller black hole masses, and lower accretion rates than previously studied samples, with the median value of black hole masses and accretion rates more than 10 times smaller than those Lastly, we would like to thank the anonymous referee, whose comments and constructive feedback contributed to the improvement of this paper.We also appreciate the time and effort that the referee and the editor have dedicated to the reviewing process of this manuscript.

Figure 2 .
Figure2.Left: Change in magnitude differences as a function of spectroscopic S/N of Group II normal galaxies sample (control sample) in g-, r-, and i-bands, respectively.Right: The corresponding standard deviations (sigma values) of the corrected magnitude differences in each band.The black crosses represent the centroids of magnitude differences and the standard deviations in each bin for all bands.While the black-dashed lines are the best-fitted Laurent polynomials.

Figure 3 .
Figure3.Left: Change in magnitude differences as a function of spectroscopic S/N of Group I AGN sample in g-, r-, and i-bands, respectively.Right: Change in magnitude differences as a function of spectroscopic S/N of Group II AGN sample in g-, r-, and i-bands, respectively.The brown triangles represent the centroids of magnitude differences in each bin for all bands for both Group I and Group II AGN.While the brown-dashed lines are the best-fitted Laurent polynomials.

Figure 4 .
Figure 4. Left: Group I AGN sample's magnitude differences corrected for the systematic offset.Right: The distribution of magnitude differences at S/N of 10 for the AGN sample (solid line) and fit (dotted line) shown with the control sample (dashed line)

Figure 5 .
Figure 5. Left: Group II AGN sample's magnitude differences corrected for the systematic offset.Right: The distribution of magnitude differences at S/N of 5 for the AGN sample (solid line) and fit (dotted line) shown with the control sample (dashed line)

Figure 6 .
Figure6.From top to bottom, the panels show typical PCA decomposition results when performed on host-dominated, 50%, and AGN-dominated spectra, respectively.The black solid lines are the observed AGN spectra.The cyan, blue, and red solid lines are the best-fitted total, AGN, and host galaxy components from the PCA decomposition, respectively.The black-dashed lines are the residuals of the fits.The green-shaded regions represent the areas on the spectra that have been masked out before calculating host fraction values of AGN.

Figure 7 .
Figure 7.The distributions of host galaxy fraction values in i-band, Ψi, with respect to apparent i-band magnitude (top left), absolute magnitude in i-band of AGN component (top right), redshift (bottom left), and median spectroscopic S/N in i-band (bottom right).The blue and orange circles represent the AGN with spectroscopy taken before and after the upgrade of the SDSS spectrograph (i.e., Group I and II AGN), respectively.

Figure 8 .
Figure 8.The reliability contours exhibiting uncertainty levels in recovering AGN fraction in i-band (1 − Ψi) in response to the initial 1 − Ψi and S/Ni values.The label values indicate the average uncertainty in the fitted AGN fraction in percent.The blue and orange circles represent the AGN with spectroscopy taken before and after the upgrade of the SDSS spectrograph (i.e., Group I and II AGN), respectively.

Figure 9 .
Figure 9. Examples of the emission features fitting procedure and quality of spectrum modeling of the AGN components, showing 3 sources with the median signal-to-noise around Hβ of 10.The fitted features include broad and narrow components of Hβ, [OIII] doublets, and F eII emission features with continuum level with a slope as indicated by red, cyan, and blue dashed lines, respectively.The green-shaded regions are the masked areas of the spectra for the process of F eII and continuum fits.The black-dotted lines represent the residuals after subtracting the best-fitted emission features and continuum.

Figure 10 .
Figure 10.Distribution of the properties of our low-luminosity AGN sample.Top panels: The scatter along Eddington Ratio and bolometric luminosity versus redshift, from left to right.Bottom left: The scatter along Eddington Ratio versus black hole mass.Bottom right: The scatter along black hole mass versus redshift.The color scales on the top and bottom panels represent the black hole mass and bolometric luminosity, respectively.

Figure 11 .
Figure 11.The Structure Function (SF), or variability magnitudes as a function of time lags between two epochs.The blue, green, and red filled circles are the mean variability in each time lag bin with the error bars representing errors on the means of each bin in g-, r-, and i-bands, respectively.The dashed lines are the best-fitted SFs with the slope values and their bootstrapped uncertainties shown in the top-left legends.The solid lines are the Structure Functions of the quasars sample from L18.

Figure 12 .
Figure 12.The variability magnitudes as a function of absolute i-band magnitudes of the AGN.The blue, green, and red triangles are the mean variability in each magnitude bin with the error bars representing errors on the means of each bin in g-, r-, and i-bands, respectively.The dashed lines are the best-fitted models with the slope values and their bootstrapped uncertainties shown in the top-right legends.

Figure 13 .
Figure 13.The binned mean variability amplitudes of our AGN sample as a function of black hole mass and accretion rate as Eddington Ratio in g-, r-, and i-bands, respectively.The variability amplitude in each cell is represented by the color scale.The horizontal axis represents the log of the black hole mass in solar masses and the vertical axis represents the log of Eddington Ratio.

Table 1 .
Best-fitted multi-variable parameters to the slope of the Structure Function.Each column of the table represents the following: (1) the parameters to be fitted across the column direction and the models used in the fitting process across the row direction; (2) the intrinsic slope of SF; (3) redshift coefficient; (4) bolometric luminosity or Eddington Ratio coefficient (depending on the fitted model); (5) rest-frame central wavelength coefficient; and(6) black hole mass coefficient.

Table 2 .
Best-fitted multi-variable parameters to the Structure Function amplitude.