Multi-Messenger Gravitational Wave Searches with Pulsar Timing Arrays: Application to 3C66B Using the NANOGrav 11-year Data Set

When galaxies merge, the supermassive black holes in their centers may form binaries and, during the process of merger, emit low-frequency gravitational radiation in the process. In this paper we consider the galaxy 3C66B, which was used as the target of the first multi-messenger search for gravitational waves. Due to the observed periodicities present in the photometric and astrometric data of the source of the source, it has been theorized to contain a supermassive black hole binary. Its apparent 1.05-year orbital period would place the gravitational wave emission directly in the pulsar timing band. Since the first pulsar timing array study of 3C66B, revised models of the source have been published, and timing array sensitivities and techniques have improved dramatically. With these advances, we further constrain the chirp mass of the potential supermassive black hole binary in 3C66B to less than $(1.65\pm0.02) \times 10^9~{M_\odot}$ using data from the NANOGrav 11-year data set. This upper limit provides a factor of 1.6 improvement over previous limits, and a factor of 4.3 over the first search done. Nevertheless, the most recent orbital model for the source is still consistent with our limit from pulsar timing array data. In addition, we are able to quantify the improvement made by the inclusion of source properties gleaned from electromagnetic data to `blind' pulsar timing array searches. With these methods, it is apparent that it is not necessary to obtain exact a priori knowledge of the period of a binary to gain meaningful astrophysical inferences.


INTRODUCTION
Continuous gravitational waves (GWs), defined by single-source cyclic GW emission, are expected to arise from the supermassive black hole binaries (SMBHBs) that form during a galaxy merger. When a SMBHB evolves such that it emits GWs in the microhertz to nanohertz GW band (orbital periods of weeks to several decades), a sufficiently massive and/or nearby SMBHB may be detectable by pulsar timing arrays (PTAs; e.g., Aggarwal et al. 2019) (hereafter A19).
While GWs from individual sources in the PTA regime have been sought after in multiple works (Arzoumanian et al. 2014;Feng et al. 2019;Jenet et al. 2004) through a variety of methods, none have been detected. However, numerous advances have been made in the field of pulsar timing. As PTA experiments gain longer time baselines and higher cadences and the numbers of millisecond pulsars grows, sensitivity to GW sources increases. * NANOGrav Physics Frontiers Center Postdoctoral Fellow The notable ongoing PTA programs in the world include the European PTA, Parkes PTA, and the North American Nanohertz Observatory for Gravitational Waves (NANOGrav) (e. g. Desvignes et al. 2016;Manchester et al. 2013;Arzoumanian et al. 2018a, respectively). Altogether, these PTAs time approximately 100 pulsars to high precision with the goal of GW detection, among other endeavors (e. g. Caballero et al. 2018;Hobbs et al. 2020).
In addition, sophisticated GW detection methods have been developed to detect quadrupolar continuouswave signals in the data of coordinated timing arrays (e. g. Babak et al. 2016;Zhu et al. 2014;Aggarwal et al. 2019). However, past analyses that used the most up-todate methods have used 'blind' detection methods; that is, the software did not consider any binary model information gained from electromagnetic data to directly benefit the search. In comparison, most works that do target specific sources using electromagnetic information have used smaller data sets consisting of a single pulsar, as well as a periodogram approach (Jenet et al. 2004;Feng et al. 2019) rather than the full GW analysis pipeline. Here, we have combined these methods in the first search of this type, where we used the entire NANOGrav array of pulsars and full GW search analysis, while incorporating electromagnetic data to conduct a more informed search for GWs from our test source, 3C66B.
Since the report of a hypothesized orbital motion in the core of the galaxy 3C66B by Sudou et al. (2003) (hereafter S03), it has been an ideal test case for searches for GWs from SMBHBs. Using long-baseline interferometry, the authors found apparent elliptical motions in 3C66B's radio core, modeling this motion as the gyration of the jet nozzle due to an orbit-induced precession of the larger black hole's jet. S03 proposed a period and chirp mass for the binary of 1.05 ± 0.03 years and 1.3 × 10 10 M , respectively. Given the relatively small redshift of the galaxy (z = 0.02126), a binary with those properties would be emitting gravitational radiation well within the sensitivity of pulsar timing arrays (PTAs).
As such, 3C66B has long been a prime candidate for continuous GW detection. It was the first object targeted for continuous wave detection, as reported by Jenet et al. (2004) (hereafter J04), in which seven years of Arecibo timing data from PSR B1855+09 (Kaspi et al. 1994) was used to search the Fourier domain timing residuals (commonly referred to as a Lomb-Scargle periodogram), using harmonic summing, for a GW signal consistent with the binary period modeled by S03. Their methodology was to search the Fourier-domain timing residuals (commonly referred to as a Lomb-Scargle periodogram) in addition to employing harmonic summing (Press et al. 1992). With these methods, they did not see evidence of a significant signal, and were able to place an upper limit of 7 × 10 9 M on the chirp mass of the system at a binary eccentricity of e = 0.
Since the study of J04, Iguchi et al. (2010) (hereafter I10), reported a 93-day variability in the active galactic nucleus's infrared light that was interpreted as likely due to doppler boosting of a relativistic outflow that is modulated by orbital motion (its period differs due to geometric effects). The new model assumed the 1.05year orbital period from S03, but predicted an updated chirp mass of 7.9 × 10 8 M , almost a full order of magnitude lower than the upper limit set by J04.
The work reported here presents a Bayesian crossvalidation framework in which we use 3C66B's binary parameter measurements as priors for our continuous wave search. Our search has resulted in the most stringent GW derived limit to date on the chirp mass of 3C66B's candidate SMBHB. We also test, more generically within our search framework, what sensitivity im-provements can be gained by knowing the GW frequency of a target to increasingly good precision. Therefore, we have quantified the improvement made by searching for GWs from a specific source, including cases where the orbital period is only known with large error or not known at all.
Note that because J04 used only one pulsar in their study, they would have been unable to perform a formal experiment to detect 3C66B, as the use of one pulsar precludes the ability to demonstrate the quadrupolar signature that is unique to the influence of gravitational waves. Thus, our study here is the first formal targeted detection experiment for 3C66B using a pulsar timing array.
This paper is laid out as follows: in Section 2, we describe our data, mathematical model, and software pipeline. In Section 3 we report the detection Bayes factor and chirp mass upper limit for 3C66B, as well as results for new test methods. In Section 4 and Section 5, we present our conclusions as well as discuss implications for future detection prospects of this and other SMBHBs.

Pulsar Timing and Electromagnetic Data
We make use of the NANOGrav 11-year Data Set (Arzoumanian et al. 2018a), which provides high precision timing of 45 millisecond pulsars. Only the 34 pulsars with baselines of at least 3 years are used for GW detection analyses (Arzoumanian et al. 2018b). We describe slight differences in the use of the data set in this work as compared to other papers in Section 2. However, the majority of the data are treated similarly to A19. Due to the 11-year timing baseline, the data set is most sensitive to binaries with orbital periods of less than a decade.
The electromagnetic data we incorporate into our models are mainly derived from S03 and I10, as well as the location from the NASA/IPAC Extragalactic Database (NED) 1 . These values are summarized in Table 1. The right ascension, declination, and luminosity distance are taken as constants throughout the analysis, as the PTA sensitivity to sky location and distance is much lower than any associated errors. For consistency with earlier work, we take the luminosity distance of 3C66B to be 85 Mpc, as in S03. Therefore, all cal- any GW strain limit can be converted to the reader's preferred distance by multiplying the strain by d 85 , and M limits by multiplying by d 3/5 85 .

Signal Model
We use the methods presented in A19 for the generation of expected pulsar timing residuals influenced by a signal from a continuous GW from a circular SMBHB. While we will not present the full derivation, we will summarize below the relevant equations needed to follow our analysis on the NANOGrav data and refer the reader to A19 for more detail. Note that throughout this section, equations are written in natural units (where G = c = 1).
Pulsar timing residuals describe the deviation of an observed pulse arrival time from that predicted from a model based on spin, astrometric, interstellar delay, and if needed, binary parameters of the pulsar. These are the basic data product that we use to search for GWs, which will not be included in the pulsar's timing model. Timing residuals for each pulsar are constructed as where M is the design matrix, which describes the timing model, and is a vector of the linearized timing model parameter offsets from the best fit solution. In other words, the timing model, which was originally derived without the presence of a GW, must now be adjusted. We write a vector describing the white noise in the data as n white , and the same for the red noise, n red , which is correlated over long timescales. The noise terms are described in more detail in Section 2.3. The signal s can be derived as follows. For a GW source whose sky location is described by polar and azimuthal angles θ and φ, the strain induced by the emitted GWs is written in terms of two polarizations as where h +,× are the polarization amplitudes and e +,× ab are the polarization tensors, which we write in the solar system barycenter (SSB) frame as (Wahlquist 1987). In these equations, we defineΩ as a unit vector pointing from the GW source to the SSB, written aŝ Ω = − sin θ cos φx − sin θ sin φŷ − cos θẑ .
The pulsar's response to the GW source is described by the antenna pattern functions (Sesana & Vecchio 2010;Ellis et al. 2012;Taylor et al. 2016 and references therein) wherep is a unit vector pointing from the Earth to the pulsar. Finally, we write the signal s induced by the GW, as seen in pulsar's residuals, as Here, ∆s +,× represents the difference between the signal induced at the Earth (the Earth term) and that at the pulsar (the pulsar term), and can be written as where t is the time at which the GW passes the SSB and t p is the time the GW passes the pulsar. 2 These times can be related from geometry by where L is the distance to the pulsar. For a circular binary at zeroth post-Newtonian order, s +,× is given by (Wahlquist 1987;Lee et al. 2011;Corbin & Cornish 2010) s where i is the inclination angle of the SMBHB, ψ is the GW polarization angle, d L is the luminosity distance to the source, and M is the chirp mass, which is related to the two black hole masses as It is important to note that M and ω, in this case, refer to the observed redshifted values. For a circular binary, we relate the orbital angular frequency to the GW frequency with ω 0 = πf GW , where ω 0 = ω (t 0 ). For this work, as in A19 we define t 0 as the last MJD in the 11-year data set (MJD 57387). The orbital phase and frequency of the SMBHB are given by where Φ 0 and ω 0 are the initial orbital phase and frequency. As in A19, we use the full expression for ω(t) to maintain consistency across runs, as this form is needed to model the signal at the higher frequencies sampled in some runs, as described in Section 2.4.3.

Software and Analyses
In this work, we make use of NANOGrav's GW detection package, enterprise 3 , an open-source code written fully in Python that contains a built-in interface with the pulsar timing data and noise models required to perform Bayesian GW analysis (Arzoumanian et al. 2018b, limits and detection). Basic algorithms for Bayesian continuous wave analysis are described in detail in a number of past works (e. g. Ellis 2013; Ellis & Cornish 2016) Using enterprise, we can use a priori constraints on a binary system, which come from electromagnetic observation (for instance, the period of 3C66B) to set priors on GW parameters that are derived from the binary model. Within enterprise we can easily add these priors to the timing model and noise model to obtain a full model of the signal. We then perform Markov-Chain Monte Carlo (MCMC) methods implemented in PTMCMCSampler 4 to find the posterior distribution for each of the free parameters. For 'blind' continuous wave (CW) searches as in A19, we typically set uninformative priors, which are uniform across the allowed range of values, for the binary system's parameters, such as sky location, frequency, mass, and distance to the source. Thus, the methods here could be considered a "targeted" search by our use of informed priors.
For instance, in the simplest treatment of 3C66B, a specific binary model has been hypothesized, with measurements and associated unknowns in the mass, mass ratio, and orbital frequency (e. g. S03; I10). We can use these electromagnetically constrained parameters, in addition to knowledge of the location of this object on the sky, to restrict our priors. 5 Assuming a SMBHB with a circular orbit, a continuous GW signal can be characterized by eight of the following nine parameters: which represent the GW source's: • position on the sky (θ, φ); • GW frequency, related to the orbital frequency at some reference time (f GW ); • orbital phase at some reference time (Φ 0 ); • GW polarization angle (ψ); • orbital inclination (i); • chirp mass (M); • luminosity distance (d L ); • strain amplitude (h 0 ), which is related to the chirp mass, GW frequency, and luminosity distance . The ninth parameter is redundant, as the strain amplitude h 0 can be defined by and can be related to other physical parameters by Since the strain is entirely determined by M, f GW , and d L , a limit on h 0 based on a PTA search can be translated into constraints on these source parameters. Since the uncertainties on θ, φ, and d L are much smaller than the PTA sky localization accuracy, by targeting a specific source with a known position and redshift, we can set these parameters as constant values, and therefore reduce the number of search parameters to six. In all runs, there is also a set of free parameters associated with each pulsar included in the PTA which are varied in the analysis. First of these is the pulsar distance, which has a Gaussian prior in all cases. In pulsars where the distance is reported in Verbiest et al. (2012), the Gaussian is defined using the recognized distance and the associated error. For the remaining pulsars, the Gaussian prior is defined as 1.0±0.2 kpc. As in A19, this assumption can be seen to hold in the posteriors for these pulsars, as the prior is returned in all cases, meaning the data cannot inform on the distances for these pulsars. This is expected, as these pulsars are largely those with shorter observation baselines, which are influencing the PTA to a smaller degree. Also included is the GW phase at the pulsar. While this quantity could be calculated geometrically from the other parameters, including it as a search parameter mitigates potential issues sampling the complex parameter space, which arise due to the large uncertainty on the distances to the pulsars compared to the GW wavelength.
As is standard for these types of analyses, (e.g., Arzoumanian et al. 2018b;Aggarwal et al. 2019) the white noise of each pulsar (described as EFAC, EQUAD, and ECORR) is held fixed. The power spectral density of the pulsar intrinsic red noise is modeled as where A red (the red noise amplitude) and γ (the red noise spectral index) are also allowed to vary in each pulsar in our Markov-Chain Monte Carlo simulation.
Here, f yr is (1/1yr) in Hz. To assist the sampler, empirical distributions of the red noise parameters were made from single pulsar noise run posteriors and used to create jump proposals. These determine how steps in the MCMC are taken through generating proposed samples, and were added to significantly improve sampling and decrease burn-in time for our analyses. For a more detailed description, see Appendix A of A19.
Our treatment of the red noise in one pulsar, J0613−0200, required additional noise modeling. As described in A19, this pulsar possesses extra unmodeled noise processes that, in the 11-year continuous wave search, presented as an increase in strain upper limit at a frequency of 15 nHz. In this work, this manifested as poor sampling in the CW parameters, particularly in f GW . Because of this poor sampling, the f GW parameter would periodically get stuck near this frequency. To mitigate this effect, we applied more sophisticated noise modeling techniques to allow the red noise to deviate from the typical power-law, with corresponding jump proposals to assist sampling. The noise model that was chosen is a t-process spectrum, which allows for 'fuzziness' in the typical power-law spectrum by scaling the power spectral density by a variable factor for each frequency. This model is created by generalizing the typical Gaussian process prior to a Student's t-distribution. This process will be discussed in more depth in Simon (in prep). Even with this model, poor sampling in the f GW parameter still occurred, and can be attributed to unmodeled noise due to changes in the dispersion measure of pulsar J1713+0747, caused by variations in the interstellar medium along the line of sight (Lam et al. 2018;Hazboun et al. 2020). While this pulsar is NANOGrav's most sensitive in general, it is not particularly sensitive to 3C66B, as shown in Figure 1, and thus excluding it did not significantly effect the upper limit on target 3C66B. As such, this pulsar was removed from our search. The above procedure is used for all enterprise runs as described in detail in the next subsection.

Four Distinct Tests
We constructed several separate set-ups for enterprise for the purpose of testing distinct hypotheses. The purpose of each of these, and the difference in procedures within enterprise, is described below.

Detection
To determine if a CW from 3C66B is detected, we conduct an enterprise search using a single frequency, with a value corresponding to the 1.05-year orbital period for a circular binary, making the final set of search parameters Note that the I10 and S03 models make assumptions about the electromagnetic data which may or may not be correct; our model simply tests the presence of a binary SMBH in this system at a period of 1.05 years. The detection prior on M is log-uniform in the range 10 7 to 10 10 M , and is sampled in log-space. This prior is convenient for calculating Bayes factors as a measure of detection significance, using the Savage-Dickey formula (Dickey 1971), Here, H 1 is the model with a GW signal plus individual pulsar red noise, and H 0 is the model with individual pulsar red noise only. The prior and posterior volumes at h 0 = 0 are p (h 0 = 0|H 1 ) and p (h 0 = 0|D, H 1 ), respectively. We are able to apply the Savage-Dickey formula because these models are nested (H 0 is H 1 where h 0 = 0), and p (h 0 = 0|D, H 1 ) is approximated as the fraction of quasi-independent samples in the lowestamplitude bin of a histogram of h 0 . The error in the Bayes factor is computed as where n is the number of samples in the lowest amplitude bin. This process is done once the samples in GW strain are calculated from the directly sampled parameters. In the detection analyses, the red noise amplitude is sampled with a matching prior (log-uniform in A red ). All other GW parameters are searched with a uniform prior.

Upper Limits
To set an upper limit on the chirp mass of 3C66B, we again conduct an enterprise search using a single frequency, with a value corresponding to the 1.05-year orbital period, making the final set as in the previous section (Equation 22). However, in contrast with the case for detection, the upper limit prior on M is uniform (rather than log-uniform) meaning the prior set on the log 10 M exponentially increases over the range {7, 10}. This is done as an astrophysically reasonable prior, as we expect SMBHBs to lie anywhere in this mass range, while still allowing for efficient sampling. Additionally, this prior choice allows the derived upper limit to be as conservative as possible by allowing a higher proportion of high chirp mass samples, and be independent from the choice of lower prior bound. In the upper limit analyses, the red noise amplitude is sampled with a matching prior (uniform in A red ). Upper limits are taken to be the value of the 95th percentile of the posterior distribution. Following the approach of Arzoumanian et al. (2018b), we calculate the error on upper limit calculations as where x = 0.95 and N s is the number of effective samples in the chain, which is estimated by dividing the total number of samples by the autocorrelation length of the chain.

Frequency Prior Testing
In addition to the tests described above of the S03 and I10 models, where the GW frequency is fixed to discrete values as in other continuous wave searches (A19; Arzoumanian et al. 2014), it is also crucial to test frequencies within the confidence region of these values. For this aim, we have developed methods to directly sample in f GW . These include specialized parameter groupings and jump proposals to help the sampler move through the more complex parameter space. Using these techniques, we are able to obtain an upper limit from the M posteriors for a variety of frequency priors from various enterprise setups.
When searching over GW frequency, a log-uniform chirp mass prior is used, and the samples are re-weighted to modify the prior choice from a uniform-in-log distributions of masses to a uniform-in-linear distributions of masses, the latter of which is more common in upperlimit analyses by virtue of insensitviity to the lower sampling boundary. This both assists with sampling and maintains a consistent prior on the GW strain, which is not directly sampled. To match the M prior, a loguniform prior is used on A red . Since we are no longer fixing f GW to a single value, our final parameter set for these searches was In addition, we also chose to limit our GW frequency prior to a range of 1-100 nHz, rather than the 1-300 nHz used in A19. Besides the PTA's insensitivity at these high frequencies, we expect a source to remain in these frequency bins for very little time, with residence timescales as small as months, so their detection prospects are minimal (Burke-Spolaor et al. 2019; Hazboun et al. 2019b).
Using the three priors shown in Table 2, we are able to find re-weighted upper limits for a variety of scenarios. These include: 1. The GW frequency is known, and set to a single value 2. The GW frequency is known with large errors, and the error region is searched over 3. The GW frequency is not known or has extremely large errors, and the entire PTA sensitivity band is searched over. Then, we examined the change in re-weighted chirp mass upper limit as a function of frequency prior width. In addition to allowing for possible errors in the orbital period measured by S03, these widened priors allow us to test the feasibility of this process on a less constrained source. Additionally, if there was any significant frequency evolution in the source, a signal would still have the chance to be detected in either of these setups. In addition to a single value and a uniform prior across the PTA sensitivity bandwidth, we also use 10 times the uncertainty on the predicted frequency. We also bin the samples of the widest f GW search to interpolate be- tween these runs. The results of this examination are described in Section 3, and are summarized in Figure 5

Test of a Specific Binary Model
To directly test the consistency of the model presented in I10 with the NANOGrav data, we create priors for an enterprise run corresponding to the values presented (see the first line of Table 3). For f GW , we are able to use a Gaussian prior, where the error on the measured value from I10 directly corresponds to the standard deviation of the prior. However, M has uneven error bars, so a more complicated prior is needed. Here, we fit a skewed normal distribution to the reported value and error, and construct a skewed normal prior based on this distribution, and also keep a log-uniform prior on A red . Therefore, the final parameter set for this search was To analyze the amount of information gained between the prior and posterior models, we employed the Kullback-Leibler divergence (Kullback & Leibler 1951). We calculate this information gain in bits between the posterior p(x|d) and the prior p(x) as This is done for the distributions for both M and f GW . To maintain consistency between forms of the posterior and the prior, we fit a skewed normal distribution to both posteriors to directly compare to the prior.

Detection
Using the setup for a detection run as described in Section 2.4.1, we find no evidence for a GW signal from 3C66B. We calculate a Savage-Dickey Bayes factor of B 10 = 0.74 ± 0.02. Therefore, there is no evidence for the detection of a GW signal in the data. The posterior for this run is plotted in Figure 2. Shown in orange is the chirp mass upper limit of I10, with the shaded region representing the error on the value. With these methods, the I10 mass estimate is impossible to rule out. We also note that the peak in the posterior at 1 × 10 10 M is not statistically significant.

Upper Limits
As no GW signal is detected from 3C66B, we set upper limits on the chirp mass using the procedure described in Section 2.4.2. Using the constant-value frequency prior at 60.4 nHz (corresponding to the 1.05-year orbital period), we set a 95% upper limit of (1.65±0.02)×10 9 M for M of the SMBHB in 3C66B. This value corresponds to a strain of (2.45 ± 0.06) × 10 −14 . To compare, the expected strain of the model in I10 is (7.2 +6.8 −5.8 ) × 10 −15 .
As can be seen in Figure 3, while we achieve a factor of 4.3 improvement over the limit set by J04, we cannot rule out the I10 model. The posterior distribution of samples does include a peak at about 1 × 10 9 M , which is within the error region for the chirp mass calculated from I10. However, this peak is not statistically significant, and is able to be traced to a single pulsar, J1909−3744. This likely occurs due to covariances between the model and sinusoidal behavior caused by noise processes in the data. Therefore, this peak in the posterior is not indicative of a signal, and our upper limit can be considered robust. We will note that the upper limit listed can be calculated for the reader's preferred distance using the transformation described in Section 2.1.

Frequency Prior Testing
As described in Section 2.4.3, we also performed tests to quantify how much our upper limits might improve if we have constrained (through electromagnetic observation) the orbital frequency of the target. While for 3C66B the orbital frequency is assumed to be known to within small errors, for other targets, a frequency may not be known or be only poorly constrained. This test provides a sense of how well the period must be constrained to provide effective sensitivity gains for a GW search.
Using the three scenarios described above, we are able to characterize the change in re-weighted upper limit between the setups. The result of the log-uniform prior search over the entire frequency band is summarized with Figure 4. The white area represents the area of Mf GW parameter space ruled out in this analysis. From the uniformity of the samples over the parameter space, it is clear there are no sampling issues. This is due to the improved sampling methods described in Section 2.3. The weighted 95% upper limit is plotted for each frequency bin, allowing us to quantify for which frequencies we are the most sensitive to 3C66B.
In addition to the three runs described above, it was also possible to infer the upper limit that would be derived from a run with a frequency prior width between those of the three separate runs. To do this, we bin the samples in the scenario 3 (widest f GW prior) run to keep only a certain range of frequencies and recalculate the weighted upper limit for this subset. These bins increase symmetrically in log space about the value of f GW reported by S03, from a log space width of 0 to 2 (essentially, 2 orders of magnitude in linear space). The weighted upper limits calculated from these binned samples are plotted in Figure 5.  . 2D histogram of samples in the log-uniform prior setup. Also plotted is the weighted 95% upper limit for each frequency bin (blue) from the scenario 3 setup. The white area indicates the section of parameter space ruled out by our search. It is clear from the uniform distribution of samples across all frequency and mass channels that all sampling issues have been resolved. This uniform distribution also makes clear that there is no indication of a signal at the distance and sky location of 3C66B. We only plot the upper half of the parameter space in M to resolve more detail. Below log 10 M = 8.5, all sampling is uniformly distributed, identically to the upper half of the figure. For comparison, the scenario 1 weighted upper limit (orange triange) and I10 chirp mass estimate (red star) are also shown.
Also plotted in Figure 5 are the upper limits from the three individual runs. From the consistency of these points with the calculated curve, it is clear that this technique is robust. Additionally, this shows the feasibility of searching over f gw , as the results are consistent with those calculated for both an individual frequency and a small range.
As can be seen in Figure 5 and Table 2, there is nearly an order of magnitude difference in the upper limits derived from frequency varied runs of different prior widths. However, from the curve calculated from binned samples, we see that this increase does not begin until about one order of magnitude in frequency space about the I10 value is included. This implies that a targeted search such as this is worthwhile even without exact orbital information, as long as the frequency is known to within an order of magnitude.

Test of a Specific Binary Model
To directly test our sensitivity to a GW from the model of 3C66B proposed in I10, we directly test priors as described in Section 2.4.4. In Figure 6, we can see the clear differences between the prior and posterior  Figure 5. Chirp mass upper limits plotted with respect to frequency prior width (blue). Also shown as horizontal lines are previous upper limits set by S03 (green), J04 (purple), and I10 (orange), from top to bottom. Shaded regions describe error bars on the quoted limit. It is clear that none of these upper limits rule out that of I10. However, this figure accentuates the fact that when a period is known to less than 1 order of magnitude of precision, the limits on the target's mass improve by nearly one order of magnitude; that is, while the tightest prior produces the lowest upper limit, moderately wide priors also produce similar results, indicating that perfect orbital models would not be necessary to perform such a search on other systems. It is not until the prior spans approximately an order of magnitude that sensitivity is lost. Also plotted for comparison are the weighted upper limits for each of the three separate runs.
for both f GW and M. These differences are quantified in Table 3, where the error on the posterior values are calculated with the percentiles of the posterior distribution corresponding to 1σ error bars. The values of f GW are consistent with those of the prior, but for M, we are able to significantly lower the upper bound on the value, effectively ruling out part of the high mass region of the model.
Additionally, we report the information gained between the posterior and the prior as described in Section 2.4.4. The differences in the distributions for f GW produce a KL divergence of 0.0096, while those of the M distributions produce a KL divergence of 0.2597. While neither of these values is large, it is clear that much more information is gained about the chirp mass of 3C66B from this model test than the GW frequency.

DISCUSSION
To provide context for the upper limit on 3C66B set in this work, we can compare to the limits set in A19, which do not have the benefit of electromagnetic constraints (i. e. a 'blind' search). This comparison will allow us to estimate the improvement in sensitivity gained by including electromagnetic data over a typical blind search.  Table 3. Values and errors associated with GW frequency and chirp mass.
By comparing our upper limit to the sensitivity curve in Figure 3 of A19, we estimate that we have gained a factor of 2.1 in sensitivity by holding the source position fixed in our search. Note that a much greater improvement comes from knowing the binary candidate's period, as demonstrated in Figures 4 and 5.
With the framework developed in Hazboun et al. (2019b) we can construct detection sensitivity curves to estimate the PTA that will be required to detect or rule out the mass model presented in I10. The hasasia (Hazboun et al. 2019a) package 6 allows us to construct these detection sensitivity curves using a straight forward matched filter statistic and to simulate PTA data with control over the number of pulsars, observing cadence, timing precision, and data length. Using this software to estimate an idealized signal-to-noise ratio (S/N) (see Eqn (79) in Hazboun et al. (2019b)), assuming the parameters in I10 and using the pulsar noise parameters in Arzoumanian et al. (2018a) we obtain S/N = 0.87. We used this software to extend the baseline of the the existing 11-year NANOGrav data set by 6 https://hasasia.readthedocs.io/ adding new data to the existing pulsars with a timing precision and cadence that matches recent data. We also augmented the PTA, adding new pulsars with timing precisions and cadences similar to those already in the array; we added pulsars for each projected year at a rate comparable to the current growth-rate of NANOGrav, which has been approximately 7 pulsars per year for the past three data sets.
We find that NANOGrav should be able to rule out the existence of a SMBHB in 3C66B with the I10 mass within five to eight years from the end of the data set considered here. However, while hasasia allows us to calculate the PTA's sensitivity to a CW at a specific sky location, it is unable to set other parameters (such as luminosity distance) as known due to electromagnetic information as is done in this work. As is discussed above, including electromagnetic information to reduce the parameter space of the GW search allows for an increased sensitivity. Because of this, using electromagnetic information will likely allow us to accelerate this estimated timeline. To more reliably estimate this timescale, detailed simulation work will be necessary to quantify the improvement made by including electromagnetic information over typical searches.
Because the sensitivity of the array depends heavily on the observing baseline of each pulsar, the inclusion of additional data can help tremendously. Data of this sort are accessible through the IPTA (Perera et al. 2019), and followup analyses of 3C66B by the international community could prove fruitful. This timeline to the PTA sensitivity required to confirm or deny 3C66B as a SMBHB will be reduced with the more rapid addition of pulsars to the array, e.g., by adding more than 7 per year. This improvement will be accelerated if the newly included pulsars are near the sky location of 3C66B, as, currently, there are few pulsars in the array that are very sensitive to 3C66B. To accomplish this, pulsar searches should be undertaken near the sky locations of potential PTA targets to begin improving our sensitivity more rapidly. Some pulsars in this area of the sky can be included through use of data provided by the IPTA (Perera et al. 2019), showing once again that an international effort to detect 3C66B could be worthwhile.
In addition to the results for GWs from 3C66B, our work has many implications for detection prospects of other binary candidates. As discussed in Section 3.3 and shown in Figure 5, for 3C66B, it was not until we widened our prior to span an order of magnitude in frequency space on either side of the target frequency that sensitivity was lost. For similar candidates, particularly those at similarly high orbital frequencies, we presume that this result will hold. Therefore, as long as the sky location and luminosity distance of a potential target are known, a search of this type is worth attempting if at least an estimate of an orbital period can be obtained. We will caution that this improvement will differ depending on the sky location of the source, and that the amount of frequency-space that can be effectively searched with this method will be larger for higher-frequency sources. As can be seen in Figure 4, it is the inclusion of samples at low frequencies that raise the upper limit. However, typical errors on binary periods are quite a bit smaller than the limit suggested here, meaning that this method should prove useful for most binary candidates. This method will also account for any frequency error due to unaccounted for frequency evolution of the SMBHB, which, in the case of a detection, would provide important constraints for evolutionary models.

CONCLUSIONS
In this work, we present a new method for performing multi-messenger searches for individual SBMBHs, using 3C66B as a test case. 3C66B was first identified as a binary candidate by S03, and was first visited by PTAs in J04, which ruled out the proposed binary model. In the intervening 15 years, a revised model was published by I10 and PTA data and analysis methods have greatly improved. We used the NANOGrav 11-year data set, as well as the collaboration's flagship GW detection package, enterprise, to search for GWs from 3C66B. Here, we are able to limit 3C66B's chirp mass, at 95% confidence, to (1.65 ± 0.02) × 10 9 M , a factor of 4.3 smaller than the limit set in J04. However, we are unable to rule out the existence of a binary corresponding to the revised model proposed in I10.
In addition to directly placing a limit on the chirp mass of 3C66B for the published orbital period, we are able to quantify how much this multi-messenger approach increases our sensitivity over a typical 'blind' PTA search. We have conducted a search on real data that includes GW frequency as a free parameter, and from this analysis, we learn that by including frequency constraints from electromagnetic binary source measurements to restrict the prior, we can gain approximately an order of magnitude in sensitivity when compared to a frequency-blind search spanning the whole PTA band. However, this drop in sensitivity does not occur until a relatively wide range of frequencies is searched over, meaning that this approach will be useful even for candidates with relatively poor constraints on their orbital periods.

ACKNOWLEDGMENTS
Author Contributions. We list specific contributions to this paper below. CAW led the work on this paper, ran the GW searches, and led the development of the manuscript. JS, SRT, SJV, and SBS provided guidance throughout the project and provided key development of the project motivation and scientific interpretation. JAE, SRT, PTB, SJV, and CAW designed and implemented the Bayesian search algorithms in enterprise. JSH performed and interpreted the S/N simulations with hasasia. RDE performed initial literature reviews on 3C66B. NJC, JSH, DLK, MTL, TJWL, MAM, and CMFM contributed valuable scientific comments. NANOGrav data is the result of the work of dozens of people over the course of more than thirteen years. ZA, PBD, MED, TD, JAE, ECF, EF, PAG, MLJ, MTL, RSL, MAM, CN, DJN, TTP, SMR, PSR, RS, IHS, KS, and JKS developed the 11-year data set. All authors are key contributing members to the NANOGrav collaboration.