A Possible ∼20 yr Periodicity in Long-term Optical Photometric and Spectral Variations of the Nearby Radio-quiet Active Galactic Nucleus Ark 120

We study the long-term variability in the optical monitoring database of Ark 120, a nearby radio-quiet active galactic nucleus (AGN) at a distance of 143 Mpc (z = 0.03271). We compiled the historical archival photometric and spectroscopic data since 1974 and conducted a new two-year monitoring campaign in 2015–2017, resulting in a total temporal baseline over four decades. The long-term variations in the optical continuum exhibit a wave-like pattern and the Hβ integrated flux series varies with a similar behavior. The broad Hβ profiles have asymmetric double peaks, which change strongly with time and tend to merge into a single peak during some epochs. The period in the optical continuum determined from various period-search methods is about 20 yr, and the estimated false alarm probability with null hypothesis simulations is about 1 × 10−3. The overall variations of the broad Hβ profiles also follow the same period. However, the present database only covers two cycles of the suggested period, which strongly encourages continued monitoring to track more cycles and confirm the periodicity. Nevertheless, in light of the possible periodicity and the complicated Hβ profile, Ark 120 is one candidate of the nearest radio-quiet AGNs with possible periodic variability, and it is thereby a potential candidate host for a sub-parsec supermassive black hole binary.


Introduction
Periodic brightness variability in long-term monitoring of active galactic nuclei (AGNs) is widely used to search for supermassive black hole (SMBH) binary candidates in modern time-domain surveys (e.g., Graham et al. 2015aGraham et al. , 2015bLiu et al. 2015Liu et al. , 2016Charisi et al. 2016;Zheng et al. 2016). Although there are alternative explanations, such a periodicity is generally believed to reveal the binary's orbital motion, which modulates accretion processes into the black holes, giving rise to periodic variations in disk emissions. To date, systematic searches from large surveys have yielded more than one hundred quasar and AGN candidates with periodic variability, distributed over a wide redshift range up to z∼3 (Graham et al. 2015b;Charisi et al. 2016;Liu et al. 2016). Due to the limited temporal baselines (10 yr), these detected restframe periods are confined to short timescales of several years or hundreds of days. There are also a number of quasars and nearby AGNs that were reported individually to exhibit periodicity based on long-term databases, including (but not limited to) OJ287 (z=0.31; Valtonen et al. 2008), NGC 4151 (z=0.003; Guo et al. 2006;Bon et al. 2012), and NGC 5548 (z=0.017; Bon et al. 2016;Li et al. 2016). OJ 287 is longknown to have strong radio emission, while NGC 4151 and NGC 5548 have relatively weak radio emission (Ho 2002).
In this paper, we study the variability and analyze the possible periodicity in Ark 120 (z = 0.03271 13 ), a nearby radio-quiet AGN with a radio-loudness of R≈0.1 (Condon et al. 1998;Ho 2002). Ark120 has been intensively observed and investigated in optical/ultraviolet (UV; e.g., Kollatschny et al. 1981aKollatschny et al. , 1981bSchulz & Rafanelli 1981;Alloin et al. 1988;Marziani et al. 1992;Peterson et al. 1998;Stanic et al. 2000;Popović et al. 2001;Doroshenko et al. 2008;Kuehn et al. 2008) and X-ray bands (e.g., Vaughan et al. 2004;Nardini et al. 2016;Reeves et al. 2016;Gliozzi et al. 2017;Lobban et al. 2018). The published photometric and spectroscopic monitoring data of Ark120 date back to 1974. Even earlier (1935)(1936)(1937)(1938)(1939)(1940)(1941)(1942)(1943)(1944)(1945)(1946)(1947)(1948)(1949)(1950), there were historical archival plates collected in the Harvard College Observatory that recorded the photometric images of Ark 120 . A recent campaign spectroscopically monitored Ark 120 for about 100 days in 2017 with the 2.3 m telescope of Wyoming Infrared Observatory (Du et al. 2018). We carried out further observations between 2015 and 2017 using the 2.4 m telescope of the Yunnan Observatories. We compile publicly available photometric and spectroscopic data of Ark120 and obtain a total temporal baseline stretching over four decades. By visual inspection, the long-term variations in the optical continuum display a wave-like pattern with a possible period of ∼20 yr. Moreover, the broad Hβ profile of Ark120 is highly asymmetric and varies strongly with time Du et al. 2018), a remarkable analogy to what was discovered in NGC5548 (Bon et al. 2016;Li et al. 2016).
The paper is organized as follows. Section 2 introduces briefly our new monitoring campaign and then presents our construction of a long-term database for Ark 120. Section 3 studies the long-term optical variability and measures the power spectral density (PSD). In Section 4, we perform detailed periodicity analysis for the optical light curve and estimate the associated false alarm probability. In Section 5, we summarize the long-term variations of the broad Hβ profile. A brief discussion on the possible periodicity is given in Section 6 and a conclusion is given in Section 7.

New Observations between 2015 and 2017
We carried out a two-year observation campaign between 2015 and 2017 using the Lijiang 2.4 m telescope, located in Yunnan Province, China, operated by the Yunnan Observatories. We used the Yunnan Faint Object Spectrograph and Camera with Grism 14, which covers a wavelength range of 3800-7200Å and provides a resolution of 1.8Åpixel −1 . This spectrograph is equipped with a slit long enough to allow us to simultaneously observe a nearby comparison star . We use this comparison star as a reference standard to achieve accurate absolute and relative flux calibrations, whereas the absolute flux of the comparison star is calibrated by observations of spectrophotometric standards during nights of good weather conditions. This flux calibration method is better than the common way based on the flux constancy of the narrow [O III] line as described below. However, we confirm that the narrow [O III]λ5007 lines of the calibrated spectra are constant with a mean of 0.96×10 −13 erg s −1 cm −2 and a scatter within 2.5%. The slit was fixed at a projected width of 2 5, and the spectra were extracted using a uniform window of 8 5. The raw spectroscopic data are reduced using standard IRAF version 2.16 routines before absolute flux calibration, which includes bias subtraction, flat-field correction, and wavelength calibration. The details for the data reduction and analysis will be presented in a separate paper that describes the reverberation mapping analysis (see also Du et al. 2014). We obtained 100 spectroscopic observations in total, with a typical exposure time of 30 minutes. The 5100Å continuum flux is measured in a band of 20Å wide around 5100Å in the rest frame. To be consistent with the other data sets, the Hβ flux is measured as follows: we first subtract the continuum underneath the Hβ line by interpolating between two continuum bands, 4760-4790Å and 5000-5020Å. We then integrate the continuum-subtracted Hβ flux between 4810Å and 4910Å.
The previously published spectroscopic observations were only to 2005, so there is no temporal overlap with our observations. As a result, to align the fluxes with the historical data, we need to correct for the differences in the host galaxy contamination. Based on the two-dimensional decomposition of the Hubble Space Telescope host galaxy image of Ark120 by Bentz et al. (2009), we estimate the relative difference of the host galaxy contribution to 5100Å fluxes though the apertures used in our observations and in Peterson et al. (1998)  10 −13 erg s −1 cm −2 as the reference data set used. 14 To account for the different aperture sizes, we apply a scale factor j and a flux modulation G to the 5100Å flux densities and Hβ fluxes of each data set with respect to the reference using (e.g., Peterson et al. 1995;Li et al. 2014)

Light Curves of 5100Å Continuum and Hβ Fluxes
The values of j and G are determined by comparing the closely spaced measurements of the two data sets. The interval for comparison is adopted to be 50 days for the data sets D08 and W96 in Table 1, which yields a good temporal overlap. For the data sets D99b and P89, there is a short interval (two yr) of temporal overlap with the other data sets. Meanwhile, the sampling of these two data sets is poor. We thus increase the interval for comparison up to 100 days to obtain a sufficient number of nearly contemporary observations. Since we concentrate on long-term variations, the choice of the interval has no influence on our analysis. It is impossible to intercalibrate the Hβ fluxes of our new observations with the other data sets because there is no temporal overlap. We present the light curve of the Hβ fluxes for the purpose of illustrating the variation behavior of the Hβ line. We do not use it for any subsequent periodicity analysis. For V-band photometric data, we first convert the magnitudes into flux densities by adopting the zero-point F λ (V=0)=3.92×10 −9 erg s −1 cm −2 Å −1 (Johnson 1966). We then scale the flux densities to align with the 5100 Å flux densities by again applying a scale factor (j) and flux 14 It is possible that the narrow [O III] line has a long-term secular variation as in NGC5548 (Peterson et al. 2013). We do not include such a variation in our intercalibration as there were no reliable absolute flux measurements over the whole period of the data sets. adjustment (G) to bring them to a common flux scale with the light curve of the 5100Å continuum using The interval for comparison is adopted to be 50 days for all of the data sets. Table 1 also lists the values of j and G for all of the spectroscopic and photometric data sets. In the right panels of Figure 1, we plot the intercalibrated light curves of Vband and 5100Å flux densities, and Hβ integrated fluxes. The original light curves without intercalibration are plotted in the left panels of Figure 1 for comparison. Appendix A tabulates a portion of the merged light curve of the 5100Å and V-band fluxes, while the entire light-curve data is available in online.

The Data Set of Miller (1979)
Miller (1979) compiled the B-band data of Ark 120 over 1935-1950 using the archival plate collection of the Harvard College Observatory. Albeit with poor cadence and low measurement accuracy, this data set still provides precious historical variation information of Ark 120. Indeed,  detected significant flares around 1937-1939. We digitalize Figure 1 of  and obtain the B-band magnitude data. It is impossible to intercalibrate this data set with the other data sets described above. Despite this, a useful comparison is possible by converting the B-band magnitudes into fluxes and then simply adjusting the fluxes to make the mean and standard deviation to be at the same scales as those of the other data sets. We do not include this data set for a subsequent analysis, but take it as an auxiliary data set to illustrate the long-term variations of Ark120. Figure 2 plots the light-curve data of  together with the merged light curve of the V-band and 5100Å continuum.

Long-term Optical Variations
As shown in Figure 1, all of the light curves of the V−band, 5100Å continuum, and Hβ integrated fluxes oscillate with a wave-like pattern. In the merged light curve plotted in Figure 2, there are clearly two major crests around 1985 and 2005 and two major troughs around 1995 and 2015. Meanwhile, one prominent flare appears around the bottom of each major trough. Both flares rise rapidly in flux by more than 50% within 200 days. The latest flare (2016-2018) then undergoes an even Notes. a The flux of [O III]λ5007 used for absolute calibration of spectra in the data set. b The photometric data of all of the data sets are at the V band, except for data set M79 at B band. c This data set has no temporal overlap with the other data sets so it is impossible to do intercalibration. We simply adjust the fluxes to make the mean and standard deviation to be at the same scales as those of the other data sets. We do not use this data set for a periodicity analysis. d This data set includes 15 additional observations (50,863-52,282) that have not been made public. e There is a large flux difference with respect to the other data sets. We offset the fluxes by −20.0×10 −15 erg s −1 cm −2 Å −1 before performing intercalibration. f This data set has a poor temporal overlap with the other data sets, and we simply fix j=1.
more sudden drop within the same timescale. All of the separations between peaks/toughs and between the locations of the flares are roughly 20 yr. 15 If we extrapolate such a repeated pattern to the period of 's data set, we find that the time (1937)(1938)(1939) of the significant flare reported in  remarkably coincides with the anticipated time of recurring flares with a period of ∼20 yr (see Figure 2). Unfortunately, the data from  are very noisy and scattering, therefore, we are unable to be sure whether the variations follow the wave-like pattern seen in the light curve from 1974 to 2018. On a short timescale of months, Ark120 also undergoes strong variations, which are commonly believed to originate from local instabilities of the accretion disk (e.g., Czerny et al. 2008;Kelly et al. 2009).

PSD Analysis
In Figure 3, we show the PSD of the merged light curve. We resample the light curve into an even time grid, subtract the mean flux to eliminate the zero-frequency power, and then calculate the modulus squared of the discrete Fourier transform at each sampled frequency f j (Uttley et al. 2002): where x i is the flux at time t i ; f j =j/NΔT with j=1, K, N/2, ΔT is the sampling interval; and N is the number of points in the data. The PSD is defined with an adopted normalization as To compare with the X-ray PSD, we also bin the logarithm of the PSD over a logarithmically spaced frequency grid, with a grid width of log 1.3 ( ) and a minimum of two power measurements per bin (Uttley et al. 2002). The uncertainties of the binned powers are set to be the Poisson noise power. Lobban et al. (2018) measured the PSD of Ark 120 in the X-ray band (0.3-10 keV) using ∼420 ks XMM-Newton monitoring data and additional Swift, RXTE, and Nuclear Spectroscopic Telescope Array (NuSTAR) observation data (see their Figure 13). We adjust the normalization of the X-ray PSD to align with the optical PSD and superpose it on the optical PSD in Figure 3. Figure 3 illustrates that both the optical and X-ray PSDs can be universally described by a single power law with a slope of ∼−1.6, despite some weak features in the f×P( f ) plot. The similar shapes of the optical and X-ray PSDs potentially imply that the optical and X-ray variabilities may have the same driving mechanism. A plausible scenario is reprocessing of X-ray emission into the UV/optical band (Guilbert & Rees 1988).
We apply three widely used PSD models to fit the optical light curve and compare the relative merits of these three models using the framework RECON developed in . This framework parameterizes the AGN time series in the frequency domain and transforms it back to the time domain to fit the data, and infers the model parameters using a Markov chain Monte Carlo (MCMC) algorithm (see  for details). Compared with methods that directly fit PSDs (e.g., Vaughan et al. 2016), such an approach can cope  Table 1. 15 We note that some numerical simulations of supermassive black hole binary systems produce a similar variation pattern (crests/troughs and flares) in mass accretion rates onto the binary black holes (e.g., Bogdanović et al. 2008Bogdanović et al. , 2009Miranda et al. 2017).
with irregularly sampled light curves and also naturally takes into account the sampling effects, such as spectral leakage and aliasing. In addition, the framework calculates the Bayesian evidence, and hence the Bayes factor, that allows us to perform model comparison. The shortcoming of this approach is that the (inverse) Fourier transform and MCMC sampling are highly computationally expensive for a large number of data points. To save computation time, we bin the light curve every 10 days, reducing the number of points from 1473 to 471. Such a binning manipulation degrades the maximum frequency that can be obtained from the data. Since we mainly concentrate on variations with timescales of years, this manipulation does not influence our main analysis results. It is worth stressing that in fitting the light curve, we assume independent Gaussian measurement noise.
Here, the Bayes factor is defined to be the ratio of the posterior probabilities (Sivia & Skilling 2006). For two models, say M 1 and M 2 , with equal prior weights, the Bayes factor is simplified to be the ratio of the corresponding Bayesian evidence, is the posterior probability; P D M 1,2 ( | ) is called the Bayesian evidence, which has been marginalized over the whole model parameters; and D represent the observed data. A large value of K>1 means that model M 2 is preferable to model M 1 . A widely used criterion for quantifying the degree of preference is given by Kass & Raftery (1995).
The three PSD adopted models are (1) the single power-law (SPL) model with a form of and (3) the bending power-law (BPL) model with a form of Here A, α, β, and f b are free parameters with a restriction of α>β. Throughout the calculations, the prior probabilities for A and f b are set to be logarithmic, and the value of f b is limited to the frequency range of the data; the prior probabilities for α Figure 2. Merged light curve of the V-band and 5100Å continuum and the light-curve data of  for Ark 120. The red dashed line represents a sinusoid with a period of 20.5 yr. Note that it is impossible to intercalibrate the data set of . We simply adjust the fluxes to make the mean and standard deviation to be at the same scales as those of the other data sets (see Section 2.3). We do not use the data set of  for an analysis. and β are set to be uniform over a range (1, 5) and (−3, 5), respectively. Table 2 summarizes the inferred parameter values for the above three PSD models. The MCMC algorithm adopted in the calculations can not calculate the uncertainty of the obtained Bayesian evidence in a single running. We ran the algorithm 10 times and set the best estimate of the evidence based on the mean and the associated uncertainty based on the standard deviation. The Bayes factor of the DRW and BPL models relative to the SPL model are K log 2.37 1.00 = - and 0.31±0.82, respectively. This indicates that all of the three PSD models give similarly good fits to the observed data.

Periodicity Analysis and False Alarm Probability
As described above, visual inspection reveals an interval of 20 yr between peaks/troughs, potentially implying that a ∼20 yr periodicity exists in the light curve. Below, we employ several methods to attempt to confirm the periodicity and estimate the associated false alarm probability.

Periodicity
We use three methods to identify possible periodicity in the light curve as follows.
1. Lomb-Scargle (LS) periodogram. 16 The top right panel of Figure 4 shows the LS periodogram of the merged light curve, which peaks at 20.5±0.1 yr. Here, the uncertainty is determined from the uncertainty of the peak frequency given by the formula in Horne & Baliunas (1986, see Equation (14) therein). The LS periodogram essentially assumes a sinusoidal model for the data (Scargle 1982; VanderPlas 2018). 2. Phase dispersion minimization (PDM) analysis. 17 The PDM method is widely used to search for non-sinusoidal periodicity by minimizing the dispersion of the phasefolded data sets (Stellingwerf 1978). The bottom right panel of Figure 4 plots the PDM periodogram, which is minimized at 20.5 yr, which is consistent with the period from the LS periodogram. 3. Sinusoidal fit. The sampling rate of the light curve is highly irregular, and some portions have quite dense cadences. Besides, the light curve fluctuates strongly on timescales of  Note. "SPL" means the single power-law PSD model, "DRW" means the damped random walk model, "BPL" means the bending power-law PSD model, and "periodic" means the periodic PSD component that is parameterized to be a Gaussian. K is the Bayes factor given with respect to the SPL model, and FAP means the false alarm probability.  16 We use the module LombScargle in the Python package astropy (Astropy Collaboration et al. 2018) to calculate the LS periodogram. The argument normalization="standard" is switched on to implement the "least-square normalization" of the periodogram. This leads the power of the periodogram to lie between 0 and 1. 17 We use the Python module CyPDM to implement the PDM analysis (Li 2018a).
weeks/months. To alleviate these two effects, we bin the light curve every 300 days, resulting in a total of 45 data points over a baseline of 43 yr. The uncertainties of points in the newly binned light curve are assigned to be the standard deviations of each bin. The sinusoidal fitting to the binned light curve yields a period of 19.7±0.5 yr and sinusoidal amplitude of (2.32±0.20)×10 −15 erg s −1 cm −2 Å −1 . The obtained period slightly depends on the adopted bin width, with the value changing between 19.5 and 20.0 yr for the bin width at a range of 100-600 days. Throughout the calculations, we use 300 days as the default bin width. In the left panel of Figure 4, we show the binned light curve and the best sinusoidal fit. The original, unbinned light curve is also superposed for the sake of illustration. We note that the LS periodogram is mathematically equivalent to a sinusoidal fit (e.g., Scargle 1982; Vander-Plas 2018). Here, we apply the sinusoidal fit to binned lightcurve data, for the purpose of alleviating the influences of the highly irregular sampling rate and strong fluctuations on short timescales. By contrast, we calculate the LS periodogram using the unbinned light-curve data.
In a nutshell, the above three methods generally yield periods of ∼20 yr. We use the period 20.5 yr as the fiducial period in the analysis that follows.

Comparison between Periodic and Aperiodic PSD Models
In Section 2.4, we apply three aperiodic PSD models to fit the light curve of Ark 120. We also construct periodic models by including a periodic component into the above three PSD models where P aperiodic ( f ) is the aperiodic component that can be P SPL ( f ), P DRW ( f ), or P BPL in Equations (7)-(9); and the second term on the right-hand side is the periodic component, which is parameterized to be a Gaussian with free parameters A p , f p , and ω p . The corresponding period of the light curve is 1/f b . Since the values of all of these three parameters can span a wide range of magnitudes, their prior probabilities are set to be logarithmic. The values of f p are additionally limited by the frequency range of the data. We tabulate the inferred parameter values for the above periodic PSD models in Table 2. Remarkably, the center of the Gaussian for the three models corresponds to a period of P log years 1.31, 1.32, 1.25 = ( ) / , respectively, with a typical uncertainty of 0.30 dex. This is in agreement with the period P log year 1.3 = ( ) from the LS periodogram. In Figure 5, we show the best recovered PSDs for both aperiodic and periodic models. The Bayes factors for the three periodic models with respect to the (aperiodic) SPL model are K log 3.34 1.00 = - , −1.75±0.97, and −2.78±0.76, respectively. If adopting K log 2.18 <as the criterion for rejecting a model (Kass & Raftery 1995), the obtained Bayes factors imply that for the current observation data, the periodic models are not preferable to the aperiodic models. Vaughan et al. (2016) showed that red noise stochastic processes can easily produce false positive periodicity in fewcycle light curves of AGNs. The current light-curve data of Ark120 only has roughly two cycles of the period. Thus, it is important to appropriately calibrate the false alarm probability (e.g., Koen 1990; Barth & Stern 2018;VanderPlas 2018). False alarm probability essentially measures the probability that an aperiodic variation process (the null hypothesis) can produce spurious periodicity as strong or even stronger than that detected in the observed data. We perform null hypothesis simulations as follows. For a given aperiodic PSD model, we run RECON to obtain a posterior sample of model parameters, from which we randomly draw a subsample to generate a set of mock data with exactly the same sampling pattern as the observed data. To simulate the observation noises, independent Gaussian noise with standard deviation equal to the measurement errors are added into the generated artificial data.

False Alarm Probability of the Periodicity
We then apply the three methods described in Section 4.1 on the artificial data and analyze the periodicity. For sinusoidal fitting, we again bin the light curve every 300 days. We identify a "periodic candidate" if the following criteria are satisfied: 1. the LS period of the artificial data is equal to or smaller than that of the observed data, to ensure that the artificial data covers as many cycles of the period as the observed data; 2. the peak power of the LS periodogram is larger than or equal to that of the observed data (P LS 0.68) and the minimum of the PDM periodogram is smaller than or equal to that of the observed data (P PDM 0.41); 3. a sinusoid improves the goodness of fit compared to a constant with Δχ 2 greater than or equal to that of the binned observed data (Δχ 2 150; see Section 4.1); 4. and the LS period agrees with the periods from PDM analysis and sinusoidal fitting at a level of ±1.0 yr.
The choice of the second criterion is for the purpose of selecting light curves with strong periodic variation amplitudes, the third criterion is to largely ensure roughly equal fluxes of the crests/troughs as seen in the light curve of Ark120, and the fourth criterion aims to reduce possible false detections in the LS periodogram (Graham et al. 2013). The obtained false alarm probabilities are roughly (5-10)×10 −4 (equivalent to a significance of ∼3.3σ) for the above three PSD models. Combining with the Bayes factors calculated in Section 4.2, the values of the false alarm probabilities suggest that the significance of the periodicity in Ark 120 is marginal. Continued monitoring of Ark 120 is, therefore, necessary to confirm the periodicity.

Requirement of Temporal Baseline for the Periodicity Detection
The preceding analysis shows that the evidence for the periodicity is inconclusive for the current data. A question then arises: how long do we need to monitor Ark 120 to ascertain that the periodicity is significant at a given confidence level? To address this question, we make the following assumptions: (1) the long-term variations are sinusoidal with the parameters given by the sinusoidal fitting in Sections 4.1, and (2) the variations at short timescales follow a single power-law PSD with the parameters A log 2.2 0.2 = - and α=1.6±0.1 from Section 3. The single power law is set to flatten to a constant below 1.0×10 −3 day −1 to confine the generated variations to short timescales. We generate a artificial light curve by summing up the above two variation components. Following the typical configurations of the observed data, the cadence of the artificial light curve is set to be 10 days apart, and the seasonal gap is set to last three months. A Gaussian noise with a standard deviation of 0.37×10 −15 erg s −1 cm −2 Å −1 is added into the artificial light curve. One quarter of points are discarded to mimic bad weather or instrumental problems. We then attach the artificial light curve to the observed data and use this new artificial data to calculate the false alarm probability as in Section 4.3. Appendix B illustrates three examples of artificial light curves for Ark120.
In Figure 6, we show the dependence of the estimated false alarm probability on the monitoring time length for three aperiodic PSD models. The current baseline of the observed database is 43 yr, corresponding to a ∼3.3σ confidence level (see Section 4.3). Under the observation configurations described above, an additional monitoring period of ∼30-60 yr (about 1.5-3 cycles) is required to reach a 5σ confidence level (equivalent to a false alarm probability of 6×10 −7 ). We stress that the adopted observation configurations are quite conservative and, therefore, the obtained monitoring length is just a conservative estimate. Higher data quality (e.g., higher signal-to-noise ratio and smaller seasonal gap) may reduce the required monitoring length.

Hβ Profile Variations
We also compile the historical Hβ profile of Ark120. The Hβ profiles are mainly from Stanic et al. (2000) and Doroshenko et al. (2008), which presented 10 averaged Hβ profiles over 1977-1990 and 21 Hβ profiles over 1992-2005 (see their Figure 5), respectively. In addition, Capriotti et al. (1982) displayed six spectra of Figure 6. Dependence of the estimated false alarm probability on the monitoring time length. The vertical dashed line represents the baseline of the observed data. Beyond the data baseline, the false alarm probability is calculated using artificial light curves for Ark 120. Note that the false alarm probabilities depend on the configurations of artificial data, which are described in Section 4.4. Ark120 over 1976-1981 (see their Figure 3), Korista (1992) presented 12 spectra over 1981-1989 (see their Figures 1-3), Peterson et al. (1998) show one averaged spectrum from around 1993 (see their Figure 1), and Afanasiev et al. (2019) presented one spectrum from 2014 (see their Figure 5). We digitalize these figures to obtain the profile data. Since we only concentrate on the shape of Hβ profile, the absolute fluxes are no longer important. Among these references, Stanic et al. (2000), Doroshenko et al. (2008), andAfanasiev et al. (2019) had already extracted the broad Hβ profiles. We directly use their extracted Hβ profiles. For the other spectra, we isolate the broad Hβ profiles following the procedure in Stanic et al. (2000). The narrow Hβ line and Fe II emission are subtracted using a simple spectral decomposition (see Stanic et al. 2000 for details). The bottom panel of Figure 7 shows an example for decomposing the historical spectrum from 1976 November 22 digitalized from Capriotti et al. (1982). Haas et al. (2011) displayed one spectrum from 2009 and Marziani et al. (1992) presented four spectra from between 1989 and 1990, with both these studies providing us with their electronic data. We queried the NASA/IPAC Extragalactic Database and found one spectrum from the 6dF galaxy survey (Jones et al. 2009) and one spectrum from the updated Zwicky catalog (Falco et al. 1999). We supply two mean Hβ profiles from our two-year monitoring. These spectra are of sufficiently good quality to perform a detailed spectral decomposition. The top panel shows an example of spectral decomposition for the mean spectrum between 2016 and 2017 from our Lijiang observations. The decomposition includes a power law for the continuum, two Gaussians for the narrow [O III] doublet and Hβ line, three Gaussians for the broad emission lines (Hβ and Hγ), an Fe II model shelf (Kovačević et al. 2010), and a host galaxy starlight model based on the ELODIE library (Prugniel & Soubiran 2001;Prugniel et al. 2007aPrugniel et al. , 2007b. Some other emission lines, such as He II, He I, N I, etc., are also included when necessary (modeled by one or two Gaussians). The flux ratio between [O III]λ5007 and λ4959 is fixed to be 3. A Legendre polynomial of the first order is multiplied to the sum of the above components to account for the errors in the Galactic extinction, flux calibration, or any other cause that affects the shape of the continuum. The fitting is implemented using the ULySS code 18 (Koleva et al. 2009). More details of the spectral decomposition were described in Bon et al. (2014Bon et al. ( , 2016. Furthermore, Denissyuk et al. (2015) collected spectroscopic monitoring data of Ark120 over 1976-2013 using a 70 cm telescope. We use 16 of their spectra between 2005 and 2011. The spectra are relatively noisy, and we decompose the broad Hβ line simply by subtracting the underlying continuum linearly interpolated between 4748Å and 5020Å. The narrow Hβ and [O III] lines are not subtracted.
In total, we obtain 76 profiles that cover a time span of 40.1 years from 1976 to 2017 and show them in Figure 8. We bear in mind that our compilation is incomplete and does not include those studies that investigated spectral variations of Ark 120 but did not display the spectra. The profiles are highly asymmetric and have double peaks (e.g., around 1981 and 1989), which seem to shift apart and merge again in some epochs. Moreover, the amplitudes of the red and blue peaks vary independently. These strong variations of Hβ profiles potentially reflect that the region emitting the broad Hβ line (the so called broad-line region) undergoes notable changes. Such behavior had also been found in the Hβ profiles of NGC5548 (Bon et al. 2016;Li et al. 2016). To determine whether there exist periodic variations in the Hβ profiles, we first normalize the Hβ profiles to a uniform integrated flux. We divide the Hβ wavelength region between 4741 and 4981 Å into 12 bins and obtain the "flux" in each wavelength bin. We then calculate the "multiband" LS periodogram that combines the variation information from all of the bins using the algorithm developed by VanderPlas & Ivezić (2015). The right panel of Figure 9 plots the "multiband" LS periodogram, which peaks around 20 yr. This is remarkably consistent with the period found in the continuum light curve. For the sake of comparison, the left panel of Figure 9 shows the standard LS periodogram in each wavelength bin. The periodograms show diverse patterns among wavelength bins, meaning that it is impossible to detect a period in individual bins under the current data quality.
We fold the profiles into 10 uniformly spaced phases using a period of 20.5 yr in the right panel of Figure 8. The phase bin width is ∼2 yr, and each phase bin has on average six profiles. It is more evident that in the folded profile series the red and blue peaks shift with phases significantly, and their amplitudes vary independently. For the sake of comparison, we plot the Hβ profiles from the first cycle  and second cycle  in blue and red, respectively, and the overall mean profile in black. The profiles from the two cycles are generally matched in terms of the separation and amplitude of the double peaks, further supporting that Hβ profiles may change with a similar period as the continuum and Hβ fluxes.  Capriotti et al. (1982), which has narrow wavelength coverage.
We calculate the Hβ line centroid and dispersion over a wavelength range (4741-4981)Å from the above spectra and show their changes with time in Figure 10. Despite large scattering, both the Hβ line centroid and dispersion generally exhibit a wave-like pattern as in the continuum variations. It seems that the centroid and line dispersion increase with the continuum flux density. This is difficult to explain if assuming that the broad-line region is virialized and its size obeys the tight scaling correlation between sizes of broad-line regions and AGN luminosities found in reverberation mapping observations (e.g., Kaspi et al. 2000;Bentz et al. 2013;Du et al. 2016). According to this scaling correlation, the size of the broad-line region expands as the AGN luminosity increases, leading to the line width decreasing if the broadline region is virialized. However, we stress that the present spectra data are compiled from various observations with diverse spectral resolutions, wavelength calibrations, and signal-to-noise ratios. Some of the digitalized spectra only cover the wavelength region of Hβ line so that it is impossible to recalibrate all of the spectra in a self-consistent way. Therefore, more high-quality spectroscopic observations are needed to test the results in this section. In Appendix A, we present a portion of the online table for the Hβ line centroid and dispersion data. Again, because it is difficult to reliably estimate the measurement noise for the digitalized spectra, we do not report the uncertainties for the calculated Hβ line centroids and dispersions.

Discussions on the Periodicity
Regarding the periodicity in AGN variations, there are indeed a variety of theoretic models/interpretations, including the SMBH scenario (see the discussions in Bon et al. 2016;Li et al. 2016;Lu et al. 2016). Some of the interpretations can be directly excluded in Ark120. The precessing jet model is implausible as Ark120 is radio-quiet (Condon et al. 1998;Ho 2002), so the optical continuum is most likely dominated by the disk emission. The other interpretations may be tested with the aid of high-quality reverberation mapping observations of broad emission lines (e.g., Shen & Loeb 2010;Wang et al. 2018). In particular, for the SMBH binary scenario, there are two factors that lead to distinct reverberation mapping signatures: first, the geometry and dynamics of the broad-line region(s) are different compared to those surrounding a single black hole, considering the complicated gravitational potential jointly governed by both black holes (e.g., Sepinsky et al. 2007;Popović 2012); second, if both black holes are active, there are doublet ionizing sources that illuminate the broad-line region(s), giving rise to distinguishable patterns in velocity-delay maps of broad emission lines ). Interestingly, a recent reverberation mapping campaign by Du et al. (2018) found a complicated velocity-delay map for the Hβ broad-line region of Ark 120, which shows a general decreasing trend from the blue (−3000 km s −1 ) to the red (4000 km s −1 ) wing but with a local peak around 1000-2000 km s −1 . A detailed investigation is needed to uncover the geometry and dynamics underlying such a velocity-delay map.
Ark 120 has a stellar velocity dispersion of σ å =192± 8 km s −1 (Woo et al. 2013 (Kormendy & Ho 2013). By assuming that an SMBH binary resides at the center of Ark120, the semimajor axis of the binary's orbit is a 27.0 • = lt-day (0.02 pc or 33 μas) if using 20.5 yr (in the observed frame) as the orbital period. Such a separation can be potentially spatially resolved either by the Event Horizon Telescope with an angular resolution of ∼20 μas (Fish et al. 2016) if each black hole produces radio emission, or by the Gaia satellite with an angular resolution down to 9 μas (D'Orazio & Loeb 2018). On the other hand, the characteristic strain amplitude of the gravitational-wave emission is of the order of h s ∼10 −17 , which is in the sensitivity range of nextgeneration pulsar timing arrays (Janssen et al. 2015). The previous Hβ reverberation mapping observations of Ark 120 show a typical size of 30-40 lt-day for the Hβ broad-line region Du et al. 2018), indicating that the binary orbit is slightly smaller than or comparable with the Hβ broadline region. In such cases, the interaction between the binary black holes and the broad-line regions is, therefore, important to shape the broad-line profiles.

Conclusion
We compile the historical archival photometric and spectroscopic data of Ark120 over four decades. The long-term variations of both the optical continuum and Hβ integrated fluxes exhibit a wave-like pattern. We analyze the periodicity using various methods and generally obtain a period of ∼20 yr. The estimated false alarm probability is about 1×10 −3 . The comparison between aperiodic and periodic models based on the Bayes factors suggests that the periodicity is inconclusive using the current data. Continued monitoring of Ark120 is needed to track more cycles to eliminate the false positive rate and confirm the periodicity. The broad Hβ line shows double peaks, which vary strongly with time. Using the "multiband" LS periodogram developed by VanderPlas & Ivezić (2015), we find that the overall Hβ profile also varies with the same suggested period in the continuum. These observations make Ark120 to be one of the nearest AGNs with possible periodic variability, a remarkable analogy to NGC 5548 (Bon et al. 2016;Li et al. 2016).
Although the evidence for periodicity is inconclusive using the current data, its abundant spectroscopic monitoring data still make Ark120 a good laboratory for studying the origin of asymmetric, rapidly varying broad emission lines in general, and evolution of SMBH binaries, particularly if the possible periodic variations stem from the binary's orbital motion.
We thank the referee and the statistics editor for their useful suggestions that improved the manuscript. We thank M. Haas for providing the monitoring data of Ark 120 and P. Marziani for providing the spectra of Ark 120. We acknowledge the support from the staff of the Lijiang 2.4 m telescope. This research is supported in part from the National Key R&D Program of China (2016YFA0400701 and 2016YFA0400702), from the Key Research Program of Frontier Sciences of the Chinese Academy of Sciences (CAS; QYZDJ-SSW-SLH007), and from the National Natural Science Foundation of China (NSFC;11690024, 11773029, 11833008, 11873048, and U1431228). Y.R.L. acknowledges financial support from the NSFC (11573026)  The "multiband" LS periodogram that combines the information from all of the wavelength bins, which is calculated using the Python package gatspy (Vanderplas 2015;VanderPlas & Ivezić, 2015). We tabulate our compiled optical light-curve data of Ark 120 between 1974 and 2018 in Table 3 and the Hβ line centroid and  dispersion data in Table 4. Both Tables 3 and 4 show a portion of the data and the entire tables are available in a machinereadable form online. Note that most of the compiled spectra are digitalized from published figures with insufficient wavelength coverage to estimate the measurement noise, so we do not report the uncertainties of the calculated Hβ line centroids and dispersions. Note. The data sets are the same as in Table 1. This table is available in its entirety in a machine-readable form in the online journal. Only a portion is shown here to illustrate its form and content.
(This table is available in its entirety in machine-readable form.) Table 4 The Centroid (l ) and Dispersion (σ) Data of the Broad Hβ Line Note. This table is available in its entirety in a machine-readable form in the online journal. Only a portion is shown here to illustrate its form and content.
(This table is available in its entirety in machine-readable form.)