The LISA Data Challenge Radler Analysis and Time-dependent Ultra-compact Binary Catalogues

Context. Galactic binaries account for the loudest combined continuous gravitational wave signal in the Laser Interferometer Space Antenna (LISA) band, which spans a frequency range of 0.1 mHz to 1 Hz. Aims. The LISA data stream comprises a superposition of low-frequency Galactic and extragalactic signals and instrument noise. Resolving as many Galactic binary signals as possible and characterising the unresolved Galactic foreground noise after their subtraction from the data is a necessary step towards a global fit solution to the LISA data. Methods. We analyse a simulated gravitational wave time series of tens of millions of ultra-compact Galactic binaries hundreds of thousands of years from merger. This data set is called the Radler Galaxy and is part of the LISA Data Challenges. We use a Markov Chain Monte Carlo search pipeline specifically designed to perform a global fit to the Galactic binaries and detector noise. Our analysis is performed for increasing observation times of 1.5, 3, 6 and 12 months. Results. We show that after one year of observing, as many as ten thousand ultra-compact binary signals are individually resolvable. Ultra-compact binary catalogues corresponding to each observation time are presented. The Radler Galaxy is a training data set, with binary parameters for every signal in the data stream included. We compare our derived catalogues to the LISA Data Challenge Radler catalogue to quantify the detection efficiency of the search pipeline. Included in the appendix is a more detailed analysis of two corner cases that provide insight into future improvements to our search pipeline.


Introduction
Ultra-compact binaries (UCBs) are compact or degenerate star systems with orbital periods of a few hours or less. They emit continuous gravitational radiation with frequencies in the mHz range. Circularised compact binaries of the Milky Way Galaxy are expected to be the most numerous type of gravitational wave (GW) signal below ∼5 mHz. Double white dwarfs (WDs) are the most common type of UCB, although UCBs can also involve neutron stars or black holes, some possibly with non-zero eccentricity.
Here we analyse a simulated data set from the future space-based GW detector LISA (Hils et al. 1990; Amaro-Seoane et al. 2017) and present time-evolving UCB catalogues for 1.5, 3, 6, and 12 months of simulated data. We resolve as many as 10,000 binaries from a 12-month observing period. More than 400 of these are constrained to a sky localisation area of 10 deg$^2$ or better. The simulated data span a 24-month period. Here, we do a full analysis up to 12 months. In the Appendix we discuss a few cases which show poor convergence of the sampler in the 12- and 24-month analyses. These corner cases were identified when comparing the 6- and 12-month catalogues.
LISA is a European Space Agency led mission, in collaboration with NASA and an international consortium of scientists (https://lisa.pages.in2p3.fr/consortium-userguide/), designed to explore the uncharted Universe of low-frequency GWs. It promises 'answers to fundamental questions in physics and astronomy'. Ground-based detectors are insensitive to low-frequency GWs due to gravity gradient noise from terrestrial sources. One must use space-based detectors to observe frequencies below 1 Hz.
⋆ klackeos@mpifr-bonn.mpg.de
LISA will monitor the observable Universe with a triangular constellation of spacecraft separated by 2.5 million km. Each spacecraft houses two free-flying test masses (TMs) and two lasers linking it to the two other spacecraft. Heterodyne laser interferometry will be used to observe picometre-level changes in TM separations (Weise et al. 2017). The observatory lies in a plane inclined 60° with respect to the ecliptic and will be in a heliocentric orbit with a period of one year. This arrangement is sensitive to GWs spanning four decades in frequency, from 0.1 mHz to 1 Hz (Baker et al. 2019). LISA's most anticipated sources include a stochastic GW background (SGWB) from cosmological sources (Christensen 2019), the late-time coalescence and merger of massive black hole binaries (MBHBs) (Klein et al. 2016), and stellar (Sesana 2016), intermediate and extreme mass ratio inspirals (Amaro-Seoane et al. 2007). Finally, there will be the unanticipated or as yet unknown astrophysical signals.
[...] parameter values. However, for targeted searches of known EM binaries (for example, known in optical, X-ray, γ-ray, or radio) this limitation can be mitigated by designing priors informed by the EM observations. Thousands of detached binaries and a few tens to hundreds of interacting systems will be detected (Nissanke et al. 2012). These systems provide unique laboratories for fundamental physics. For example, new bounds on the graviton mass, competitive with those produced by ground-based GW observatories and pulsar timing arrays, are expected using eclipsing (Cooray & Seto 2004) and eccentric (Jones 2004) compact binaries. Modifications to general relativity (GR) in low-velocity regimes can be constrained with high-mass white dwarf binaries (Littenberg & Yunes 2019). With an observed population of relativistic binaries, the Milky Way (MW) potential will be mapped using GWs (Adams et al. 2012), and regions previously obscured by intervening material will be revealed (Korol et al. 2018). Models of MW globular cluster formation and evolution will be further constrained (Benacquista & Downing 2013). Danielski et al. (2019) discuss the prospect of gravitational waves affirming the existence of post-main-sequence exoplanet and brown dwarf populations in the MW. Breivik et al. (2020) demonstrate that a catalogue of thousands of MW UCBs will help constrain binary star formation and evolution. Finding a WD binary system near the threshold of merging would reveal new insights into the precursor physics of Type Ia supernovae (Webbink 2010).
Multi-messenger astrophysics of compact binary stars will be enriched with a GW perspective. Follow-up electromagnetic (EM) searches of newly detected LISA binaries will confirm relativistic binaries missed by traditional MW globular cluster searches (Kremer et al. 2018, 2019). For known multi-messenger binaries, that is, systems that have been observed electromagnetically, joint EM-GW observations will provide improved physical constraints on masses, orbital parameters and dynamics, beyond what independent EM or GW observations achieve on their own (Shah & Nelemans 2014; Littenberg & Cornish 2019a). For classes of UCBs unobserved electromagnetically, Sberna et al. (2021) predict with simulation and semi-analytic evolution models that WD-black hole binaries will be detectable and could inform follow-up searches in X-ray. LISA detections of Galactic black hole/white dwarf-neutron star binaries will inform and increase the computational efficiency of radio searches for pulsars in these systems (Kyutoku et al. 2019; Thrane et al. 2020). It is also possible to use UCBs as phase/amplitude standards for self-calibration of the data (Littenberg 2018). There is also a technique that uses the annual modulation of the WD binary signal to extract an isotropic astrophysical SGWB (Adams & Cornish 2014; Lin et al. 2022), which depends on first resolving and removing as many UCBs as possible. There is even a proposal to use UCBs as a GW timing array to indirectly detect GWs in the low-frequency regime (nHz to µHz) (Bustamante-Rosell et al. 2022). Looking even further to the future, UCB analysis has been investigated using a coherent network of at least three independent space-based gravitational wave detectors (Zhang et al. 2022).
Gravitational waves from individual UCBs and from UCB populations are interesting in their own right, that is, beyond considering them a foreground noise source. Their characterisation and extraction is an integral part of the 'global fit' solution (Littenberg et al. 2020b) for extragalactic source detection at mHz frequencies. Listed above are just a few reasons why the analysis presented here is vital to achieving the widest possible scientific impact for the LISA mission. The main motivation of this paper is to further test and develop the Galactic Binary Markov Chain Monte Carlo (GBMCMC) search pipeline in preparation for a global fit.
The individual sections of our paper are as follows. In Section 2 we provide an introduction to the likelihood used in our analysis, the noise and signal models, and the UCB parameterisation. The computational resources used are also discussed there. We present GBMCMC catalogues for the observation times analysed in Section 3. For each catalogue we make various signal-to-noise (S/N) cuts to the data, and identify well-localised UCBs to target for multi-messenger studies. In addition to resolving as many UCBs as possible, we quantify the efficacy of our search pipeline by comparing the GBMCMC catalogue to the LDC Radler UCB catalogue, classifying each catalogue UCB as either matched, confused or false alarm (Section 4). Lastly, we summarise our results in Section 5. In Appendix A, two catalogue UCBs classified as confused are examined in more detail. These serve as case studies for future development of the sampler.

GBMCMC analysis
The GBMCMC search pipeline (Littenberg et al. 2020b) uses Bayesian model selection to optimise the number of detectable UCBs. In a nutshell, GBMCMC performs a global fit to the resolvable binaries using a trans-dimensional (reversible jump) MCMC algorithm (Green 1995) with parallel tempering (Swendsen & Wang 1986). At the same time, it fits a model to the residual confusion noise.
Parallel tempering is used to prevent the sampler from becoming trapped in sub-dominant modes of the posterior. It samples with parallel chains at different temperatures and exchanges parameters between chains subject to detailed balance. Higher temperature chains are able to sample the parameter space more freely. For example, a chain given an infinite temperature will simply sample the prior distribution (Littenberg & Cornish 2010).
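The exchange step between two tempered chains can be illustrated with a minimal sketch. It assumes the common convention in which a chain at temperature $T$ samples the prior times the likelihood raised to the power $1/T$; the function name `swap_accepted` and its arguments are hypothetical and are not taken from GBMCMC.

```python
import math
import random


def swap_accepted(loglike_i, loglike_j, temp_i, temp_j, rng=random):
    """Metropolis test for exchanging the states of two tempered chains.

    Assumes chain i targets prior * likelihood**(1/temp_i), so the swap
    ratio is exp[(1/T_i - 1/T_j) * (logL_j - logL_i)].
    """
    log_ratio = (1.0 / temp_i - 1.0 / temp_j) * (loglike_j - loglike_i)
    if log_ratio >= 0.0:
        return True  # the exchange never lowers the joint target density
    return rng.random() < math.exp(log_ratio)


# An infinitely hot chain has 1/T = 0: it ignores the likelihood entirely,
# which is why it samples the prior.
print(swap_accepted(-10.0, -5.0, 1.0, math.inf))
```

Because the acceptance test depends only on the two log-likelihoods and the two temperatures, exchanges of this kind preserve detailed balance for the joint distribution of all chains.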
Trans-dimensional MCMC algorithms address the model selection aspect of the problem. The MCMC stochastically transitions between models, where each model contains a different number of UCBs, while satisfying detailed balance. Therefore, the number of iterations the chain spends in a particular model is proportional to the marginalised likelihood, or evidence, for that model. Before saying more about the waveform and noise models, we describe a likelihood for Bayesian inference adapted for LISA science analysis.
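As an illustration of how occupancy translates into model selection, the sketch below counts how many post-burn-in iterations a chain spent in each model. It assumes equal prior weight on every model dimension, so the occupancy fractions can be read as relative evidences; the function names are hypothetical.

```python
from collections import Counter


def model_posterior(n_sources_per_step):
    """Fraction of chain iterations spent in each N-source model."""
    counts = Counter(n_sources_per_step)
    total = sum(counts.values())
    return {n: c / total for n, c in sorted(counts.items())}


def best_model(n_sources_per_step):
    """Model dimension in which the chain spent the most iterations."""
    posterior = model_posterior(n_sources_per_step)
    return max(posterior, key=posterior.get)


# Toy chain: 10 steps with 2 sources, 70 with 3 sources, 20 with 4 sources.
chain = [2] * 10 + [3] * 70 + [4] * 20
print(model_posterior(chain), best_model(chain))
```

The ratio of two occupancy fractions, for example 0.7/0.2 between the 3- and 4-source models in the toy chain, plays the role of a Bayes factor under the equal-prior assumption.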
For our analysis we used 100,000 MCMC steps after the burn-in phase. Convergence time depends critically on sampling from customised proposal distributions (Littenberg et al. 2020b). In the Appendix we discuss a few cases which show poor convergence of the sampler in the 12- and 24-month analyses.

Likelihood and noise constructions
The three LISA spacecraft communicate with each other via laser links forming the interferometric arms of the detector. The arms of a space-based detector will have different lengths, varying on the order of a few percent of their length over the course of a year. This occurs due to the solar wind, the gravitational coupling of the Earth-Moon system and the influence of the other planets in the Solar System on the spacecraft orbits, causing the test masses to deviate from their Keplerian orbits. In an equal-arm detector, laser frequency noise experiences the same delay in each arm and will cancel at the detector. For time-varying arm lengths, Time-Delay Interferometry (TDI) (Prince et al. 2002; Adams & Cornish 2010) has been developed to algorithmically remove the otherwise dominating laser frequency noise by generating virtual equal-arm-length interferometers, performed on the ground in post-processing. In general, many different TDI combinations of interferometer output signals, or observables, are possible (Estabrook et al. 2000). For the LISA mission two quasi-independent Michelson interferometer data streams and a third null-stream (the LISA "Sagnac" observable) will be constructed in post-processing.
The likelihood function (1) depends on the TDI observables, or 'channels', used in the algorithm. The LISA data are a superposition of two parts: the frequency response of the $I$th channel to all the gravitational wave signals incident on the detector, $h_I$, and the combination of all the noise sources impacting that channel, $n_I$, so that $d_I = h_I + n_I$. The "noise" term is a superposition of instrument noise and gravitational wave signals that are individually too quiet to extract from the data (forming a confusion noise below ∼5 mHz). The detectable gravitational wave signal is recovered using a signal model $h_I$ such that the residual $r_I = d_I - h_I$ is consistent with the noise model. For Gaussian noise the likelihood is
$$p(d|h) = \frac{1}{\sqrt{\det(2\pi C)}} \exp\left(-\frac{1}{2}\, r_{Ik}\, C^{-1}_{(Ik)(Jm)}\, r_{Jm}\right), \qquad (1)$$
where $C$ is the noise correlation matrix. Indices $k$ and $m$ correspond to the data samples for the $I$ and $J$ channels, respectively, and there is an implicit sum over the TDI channels $I = \{X, Y, Z\}$ and data samples.
If the fluctuations in the data are stationary, the noise correlation matrix is partially diagonalised in the frequency domain, $C_{(Ik)(Jm)} = S_{IJ}(f_k)\,\delta_{km}$, where $S_{IJ}(f)$ is the cross-power spectral density between channels $I$ and $J$ (Adams & Cornish 2010). Assuming the noise levels are equal and uncorrelated on each spacecraft, noise-orthogonal TDI variables $I' = \{A, E, T\}$ (Prince et al. 2002) are constructed by performing a linear transformation in the space of TDI variables, such that the cross-spectral density matrix is diagonalised. See Adams & Cornish (2010) for complete expressions for the instrument noise contributions to the cross spectra $S_{IJ}(f)$, where the realistic scenario of unequal noise levels in each spacecraft is treated.
The $\{A, E, T\}$ combination also results in signal orthogonality for frequencies below the frequency set by the light travel time along the arms of the instrument, $f_* = c/(2\pi L) \simeq 19.1$ mHz, such that $A \sim h_+$, $E \sim h_\times$ (corresponding to the two virtual Michelson interferometer channels) and $T \sim h_\odot$ (the null channel). The gravitational wave response of the null $T$ channel is highly suppressed for $f < f_*$. This is not restricted to equal arms; Adams & Cornish (2010) derived a combination that maintains this insensitivity for unequal arm length detectors. For this reason the Sagnac data combination is particularly valuable for noise characterisation and the detection of stochastic backgrounds (Tinto et al. 2001; Hogan & Bender 2001) and unmodelled signals (Robson & Cornish 2019).
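The quoted value of $f_*$ follows directly from the 2.5 million km arm length. A short check, with the constant and function names chosen here for illustration:

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s
ARM_LENGTH = 2.5e9       # LISA arm length in m (2.5 million km)


def transfer_frequency_hz(arm_length_m=ARM_LENGTH):
    """Transfer frequency f_* = c / (2 pi L) in Hz."""
    return C_LIGHT / (2.0 * math.pi * arm_length_m)


print(f"f_* = {transfer_frequency_hz() * 1e3:.1f} mHz")
```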
For this analysis we have made a number of simplifying assumptions, to be relaxed in the future. We take the noise to be stationary and assume the noise correlation matrix is diagonal in the frequency domain. In reality, the confusion noise is cyclo-stationary, with periodic amplitude modulations imparted by LISA's orbital motion (Seto 2004). Here we neglect off-diagonal terms in the frequency domain noise correlation matrix $C$. Since an overwhelming majority of signals have frequencies well below $f_*$, we only use the $A$ and $E$ data combinations in the analysis and we assume that the noise in these channels is uncorrelated. The instrument response includes finite arm-length effects of the LISA constellation and arbitrary spacecraft orbits. The TDI prescription currently implemented treats the arm lengths as equal and unchanging with time, saving on computational cost. We split the analysis into 3317 sub-bands $[f_i, f_i + N/T_{\rm obs}]$, where $f_{i,\min} = 0.00813$ mHz. The noise in each band is approximated as an undetermined constant $S_i$. The noise level in each band becomes a free parameter explored by the reversible-jump Markov chain Monte Carlo algorithm, resulting in a piece-wise fit to the instrument noise over the full analysis band.
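Under these assumptions (stationary noise, uncorrelated $A$ and $E$ channels, one constant noise level per band), the log-likelihood for a single band reduces to a sum over frequency bins. The sketch below is illustrative only: it treats the noise level as the variance of each complex Fourier coefficient, so that any factors of $T_{\rm obs}$ or one-sided versus two-sided conventions are absorbed into that number, and the function name and data layout are hypothetical.

```python
import math


def band_log_likelihood(data, model, noise_level):
    """Gaussian log-likelihood for one frequency band.

    data, model: dicts mapping channel name ("A", "E") to a list of complex
    Fourier coefficients; noise_level: assumed variance of each coefficient,
    taken as constant across the band and equal in both channels.
    """
    log_l = 0.0
    for channel in ("A", "E"):
        for d_k, h_k in zip(data[channel], model[channel]):
            r_k = d_k - h_k  # residual in this frequency bin
            log_l -= abs(r_k) ** 2 / noise_level + math.log(math.pi * noise_level)
    return log_l


data = {"A": [1 + 1j, 2 + 0j], "E": [0.5j, -1 + 0j]}
zero = {"A": [0j, 0j], "E": [0j, 0j]}
print(band_log_likelihood(data, data, 1.0), band_log_likelihood(data, zero, 1.0))
```

Because the noise level appears both in the residual term and in the normalisation, a sampler that varies it is penalised for both underestimating and overestimating the noise in the band.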

Compact binary parameterisation and signal model
A compact binary orbit is modelled with eight parameters, $N_P = 8$, $\lambda \rightarrow (A, f_0, \dot{f}, \varphi_0, \iota, \psi, \theta, \phi)$, where $A$ is the amplitude, $f_0$ is the observed GW frequency, which is twice the orbital frequency of the binary, $\dot{f}$ is the (constant) time derivative of the GW frequency, $\varphi_0$ is the initial GW phase at the observation start time, $\iota$ is the inclination of the orbital plane relative to the line of sight, $\psi$ determines the wave polarisation axes in the Solar System barycentre, and $\theta$ and $\phi$ are the ecliptic latitude and longitude, respectively. See Shah et al. (2012) for a description of parameter correlations and degeneracies.
When at least 90% of the $\dot{f}$ MCMC samples are positive, we take the orbital evolution to be GW-dominated and use (2) to estimate the chirp mass $\mathcal{M}$ and luminosity distance $D_L$ of the binary.
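For a circular binary whose frequency evolution is driven purely by GW emission, the standard leading-order relations are $\dot{f} = \frac{96}{5}\pi^{8/3}(G\mathcal{M}/c^3)^{5/3} f_0^{11/3}$ and $A = 2(G\mathcal{M})^{5/3}(\pi f_0)^{2/3}/(c^4 D_L)$, which can be inverted for $\mathcal{M}$ and $D_L$. The sketch below assumes these standard relations; the physical constants, unit choices and function names are not from the text.

```python
import math

# Physical constants in SI units (assumed values, not given in the text).
G = 6.67430e-11              # m^3 kg^-1 s^-2
C = 299_792_458.0            # m s^-1
M_SUN = 1.98847e30           # kg
KPC = 3.0856775814913673e19  # m


def gw_driven_signal(chirp_mass_msun, distance_kpc, f0):
    """Forward relations: (fdot, amplitude) for a GW-driven circular binary."""
    tau = G * chirp_mass_msun * M_SUN / C**3  # chirp mass expressed in seconds
    fdot = 96.0 / 5.0 * math.pi ** (8.0 / 3.0) * tau ** (5.0 / 3.0) * f0 ** (11.0 / 3.0)
    amplitude = 2.0 * C * tau ** (5.0 / 3.0) * (math.pi * f0) ** (2.0 / 3.0) / (distance_kpc * KPC)
    return fdot, amplitude


def chirp_mass_and_distance(f0, fdot, amplitude):
    """Inverse relations: (chirp mass in M_sun, distance in kpc); needs fdot > 0."""
    tau = (5.0 * fdot / (96.0 * math.pi ** (8.0 / 3.0) * f0 ** (11.0 / 3.0))) ** (3.0 / 5.0)
    chirp_mass_msun = tau * C**3 / (G * M_SUN)
    distance_kpc = 5.0 * C * fdot / (48.0 * math.pi**2 * f0**3 * amplitude) / KPC
    return chirp_mass_msun, distance_kpc


fdot, amp = gw_driven_signal(0.4, 14.0, 6e-3)
print(chirp_mass_and_distance(6e-3, fdot, amp))
```

Applying the inversion to each posterior sample of $(A, f_0, \dot{f})$ yields posterior samples of the chirp mass and distance; samples with non-positive $\dot{f}$ have to be excluded because the fractional power is undefined for them.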
We also have optional settings to include the second derivative of the frequency (Littenberg & Cornish 2019b) as an additional parameter. In this case the frequency derivative is no longer constant, so the parameter $\dot{f} \rightarrow \dot{f}_0$ is defined at the same fiducial time as $f_0$ and $\varphi_0$.
The detector response of the $I$th data channel to the signal from a Galactic binary with parameters $\lambda_a$ is $h_I(\lambda_a)$, and the superposition of individual UCBs forms the signal model:
$$h_I = \sum_{a=1}^{N_{\rm GW}} h_I(\lambda_a). \qquad (3)$$
The number of binaries in each sub-band, $N_{\rm GW}$, is a priori unknown and has to be determined from the analysis. A probability distribution for $N_{\rm GW}$ is established, essentially producing a catalogue for each dimension. Here we build our catalogue with the dimension having the highest Bayesian evidence. Individual binary systems are modelled as isolated point masses on slowly evolving quasi-circular orbits. Orbital eccentricity (Seto 2001), tides (Fuller & Lai 2012) and third bodies (Robson et al. 2018) are not included in our model. The signals are modelled using leading-order post-Newtonian waveforms, and the instrument response $h_I$ is computed in the frequency domain with the fast-slow decomposition method (Cornish & Littenberg 2007). FastGB, or the "fast-slow" method, decomposes the relative path length variation of the detector arm, $\delta l(t)/L$, into the product of a rapidly varying factor $\exp(i\omega_0 t)$, where $\omega_0$ is the instantaneous angular frequency of the GW, and a slowly varying amplitude factor that depends on the LISA spacecraft orbits and the GW amplitudes and is part of the LISA instrument response.
The LISA instrument response is modulated as the detector rotates about its centre and orbits the Sun. The sensitivity pattern is anisotropic, so as the detector moves the sensitivity pattern evolves with time with respect to a given source. This modulation is imprinted on the detected amplitude. Additionally, as LISA orbits around the Sun, the frequency of the GW is Doppler-shifted, resulting in a time-dependent phase shift of the instrument response (Peterseim et al. 1997). Both effects introduce a spread in the power of the source such that it is no longer monochromatic when viewed from a LISA-based frame, and the spread depends on the direction and orientation of the source. The power is reduced relative to the instrumental noise as it is spread over a series of side-bands, offset from the GW frequency at integer multiples of the modulation frequency $f_m = (1\,{\rm yr})^{-1}$. The subdominant harmonics lead to secondary maxima in the likelihood surface, which are dealt with using tailored proposals (Crowder & Cornish 2007; Littenberg 2011). The detector velocity with respect to the Solar System barycentre evolves with time. Therefore, the phase modulation depends on a coupling between frequency and sky location. This effect helps one localise the source on the sky and determine its orientation, because each source has a unique modulation pattern. GBMCMC implements multi-modal proposals to account for degeneracies and symmetries in the likelihood surface to improve chain convergence time (Littenberg et al. 2020b).
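To give a sense of scale for the side-band structure, the sketch below lists the side-band frequencies around a source and counts how many frequency bins of width $1/T_{\rm obs}$ they span. The year length of 365.25 days and the function names are assumptions made for illustration.

```python
YEAR_S = 365.25 * 86400.0  # assumed length of one year in seconds
F_MOD = 1.0 / YEAR_S       # modulation frequency f_m in Hz


def sideband_frequencies(f0, n_side=2, f_mod=F_MOD):
    """Frequencies f0 + k * f_m for k = -n_side, ..., +n_side."""
    return [f0 + k * f_mod for k in range(-n_side, n_side + 1)]


def bins_spanned(n_side, t_obs, f_mod=F_MOD):
    """Width of the +/- n_side side-bands in units of the bin width 1/T_obs."""
    return 2 * n_side * f_mod * t_obs


print(sideband_frequencies(5e-3))
print(bins_spanned(2, YEAR_S))  # side-bands out to k = +/-2 after one year
```

Neighbouring side-bands are separated by exactly one frequency bin after one year of observation, so they only become individually resolved as $T_{\rm obs}$ approaches and exceeds a year.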

Data segmentation and computational resources
Four time periods were searched: $T_{\rm obs}$ = 1.5, 3, 6 and 12 months, each with the same starting time. A catalogue is produced for each time period searched. The catalogue data include a point estimate of the UCB parameters, the waveforms, and posterior distributions for the parameters $\lambda$. The search is done in the frequency domain, where the full LISA band is divided into 3317 'analysis segments' or 'frequency segments'. The LISA band is divided into frequency segments of width $2^5/T_{\rm 1.5mo}$, $2^6/T_{\rm 3mo}$, $2^7/T_{\rm 6mo}$ and $2^8/T_{\rm 12mo}$ for the 1.5, 3, 6 and 12-month analyses, respectively. The $f_{\min}$ and $f_{\max}$ for each of the 3317 analysis segments are the same for each $T_{\rm obs}$. In Figure 1, we see how the waveforms in a particular frequency segment evolve with observing time. After 12 months of analysis all of the LDC Radler UCB signals in this segment have been recovered. An example of frequency spreading due to detector motion is also apparent in Figure 1, where the waveforms do not appear monochromatic.
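The segment widths are chosen so that doubling the observation time doubles the number of frequency bins per segment while leaving the segment boundaries fixed. The sketch below checks this arithmetic. The 12-month duration used here, 31,457,280 s, is a hypothetical value chosen for illustration and is not stated in the text; the function names are also hypothetical.

```python
T_12MO = 31_457_280.0  # hypothetical 12-month duration in seconds
BINS_PER_SEGMENT = {1.5: 2**5, 3.0: 2**6, 6.0: 2**7, 12.0: 2**8}


def frequency_resolution_hz(months, t_12mo=T_12MO):
    """Frequency bin width 1/T_obs for an observation of the given length."""
    return 12.0 / (t_12mo * months)


def segment_width_hz(months, t_12mo=T_12MO):
    """Analysis segment width: (bins per segment) / T_obs."""
    t_obs = t_12mo * months / 12.0
    return BINS_PER_SEGMENT[float(months)] / t_obs


for months in (1.5, 3, 6, 12):
    print(months, segment_width_hz(months) * 1e3, "mHz")
```

With this assumed duration every observation time gives the same segment width, close to 0.0081 mHz, which is why a given segment can be compared directly from one catalogue to the next.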
Each analysis segment is padded with data amounting to the typical bandwidth of a source. This creates a certain amount of overlap with neighbouring analysis segments and allows the MCMC to explore the data in the padded region, which is especially useful for sources with long tails that extend beyond the hard boundary of the analysis segment. During catalogue production, only samples fitting sources in the original analysis window are retained. This prevents the same source from appearing in the catalogue more than once. In the next section, we discuss how the raw chain samples from the MCMC analysis are sorted into individual UCB catalogue entries and present the results of our analysis in the form of evolving catalogues as a function of $T_{\rm obs}$.
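The padding and the subsequent filtering can be sketched as follows. The half-open interval used to assign a boundary source to exactly one segment, the dictionary layout of a sample and the function names are assumptions made for illustration.

```python
def padded_window(f_min, f_max, pad):
    """Frequency range actually explored by the sampler for one segment."""
    return f_min - pad, f_max + pad


def keep_in_window(samples, f_min, f_max):
    """Retain only samples whose f0 lies in the original (unpadded) window.

    The half-open interval [f_min, f_max) is an assumption made here so that
    a source on a boundary is assigned to exactly one segment.
    """
    return [s for s in samples if f_min <= s["f0"] < f_max]


samples = [{"f0": 0.99e-3}, {"f0": 1.00e-3}, {"f0": 1.004e-3}, {"f0": 1.009e-3}]
print(padded_window(1.000e-3, 1.008e-3, 0.002e-3))
print(keep_in_window(samples, 1.000e-3, 1.008e-3))
```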
To perform the wholesale analysis of ∼3000 frequency segments in a reasonable amount of time, we used Amazon Web Services (AWS) cluster computing resources. Each segment was analysed in an 'embarrassingly parallel' way, such that there is no communication between segments being analysed. There is interest in using cloud-based computing resources for the actual LISA mission, so our analysis is a first test run in using this infrastructure.
GBMCMC and the software used to produce the catalogues are downloadable from GitHub (Littenberg et al. 2020a). Our catalogue data are available upon request. Additionally, a Python package dedicated to exploring these data is available (Thorpe et al. 2021).

The recovered GBMCMC-Radler catalogue
The development of UCB catalogues for the LISA mission is the process of transforming the search data products into a form that is useful to the greater astronomy community. In our case, this means filtering the GBMCMC parameter chain outputs into individual catalogue entries using the maximum likelihood model. Before moving on to the catalogue results, we summarise the process of filtering posterior samples to construct catalogue entries. The details of this process are also found in Littenberg et al. (2020b) and Littenberg & Cornish (2023).
To build a catalogue for a particular frequency segment, we start with the highest evidence model chain. Namely, we select the $N_{\rm GW}$-source model which has the highest evidence. The correlation, or overlap, value between waveforms $h(\lambda_i)$ and $h(\lambda_j)$ for binaries with parameters $\lambda_i$ and $\lambda_j$, Equation (4), is used as a metric to cluster parameter samples.
We set a threshold $M_* = 0.5$ above which parameter sets are interpreted as describing the same UCB template.
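A common form for such a match is the normalised noise-weighted inner product, $M_{ij} = (h_i|h_j)/\sqrt{(h_i|h_i)(h_j|h_j)}$. The sketch below assumes that form with a constant (white) noise level across the band, in which case the noise level cancels in the ratio; the function names and the list-of-complex representation of a waveform are hypothetical.

```python
import math


def inner(a, b):
    """Inner product of two frequency-domain waveforms (constant noise assumed).

    Overall constant factors, including the noise level, cancel in the
    normalised overlap, so they are omitted here.
    """
    return sum((x * y.conjugate()).real for x, y in zip(a, b))


def overlap(a, b):
    """Normalised overlap: 1 for identical waveforms, 0 for orthogonal ones."""
    return inner(a, b) / math.sqrt(inner(a, a) * inner(b, b))


h1 = [1 + 0j, 0j, 1j]
h2 = [0j, 2 + 0j, 0j]
print(overlap(h1, h1), overlap(h1, h2))
```

Because the overlap is normalised, it measures agreement in waveform shape and phase and is insensitive to an overall rescaling of the amplitude.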
The first $N_{\rm GW}$ samples in the chain correspond to the first $N_{\rm GW}$ UCB catalogue entries, in no particular order. Waveforms for these first $N_{\rm GW}$ entries are generated and used as reference entries for cross-correlating with other sample waveforms. If the correlation value between a given sample waveform and a reference entry waveform exceeds the default threshold value of $M_*$, the sample parameters are appended to the entry parameters. Correlations are only computed when a sample has an $f_0$ that is within 10 frequency bins ($10/T_{\rm obs}$) of the reference entry $f_0$. If a chain sample is not within range of an existing entry or does not have a correlation $M_{ij} > M_*$ with an existing entry, a new entry is created and added to the list of reference entries to be matched against. This process continues until all chain samples are grouped.
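The grouping procedure described above can be sketched as a simple greedy clustering. This is an illustrative simplification rather than the GBMCMC implementation: entries are created on the fly as unmatched samples are encountered, the overlap assumes a constant noise level, and the toy waveform is a pure complex sinusoid. All names are hypothetical.

```python
import cmath
import math


def overlap(a, b):
    """Normalised overlap of two complex waveforms (constant noise assumed)."""
    def inner(x, y):
        return sum((p * q.conjugate()).real for p, q in zip(x, y))
    return inner(a, b) / math.sqrt(inner(a, a) * inner(b, b))


def toy_waveform(sample, n=64):
    """Toy stand-in for a UCB waveform: a complex sinusoid at frequency f0."""
    return [cmath.exp(2j * math.pi * sample["f0"] * t) for t in range(n)]


def build_entries(samples, waveform, t_obs, threshold=0.5, max_bins=10):
    """Group chain samples into catalogue entries by waveform overlap."""
    entries = []
    for s in samples:
        h = waveform(s)
        for e in entries:
            # Skip the (expensive) overlap if f0 is too far from the reference.
            if abs(s["f0"] - e["ref"]["f0"]) > max_bins / t_obs:
                continue
            if overlap(h, e["ref_wave"]) > threshold:
                e["samples"].append(s)
                break
        else:
            # No existing entry matched: this sample starts a new entry.
            entries.append({"ref": s, "ref_wave": h, "samples": [s]})
    return entries


chain = [{"f0": f} for f in (0.1, 0.3, 0.1001, 0.2999, 0.1)]
entries = build_entries(chain, toy_waveform, t_obs=64.0)
print([len(e["samples"]) for e in entries])
```

The frequency pre-check keeps the cost manageable, since most sample-entry pairs in a segment are too far apart in $f_0$ to have any appreciable overlap.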
Each entry has an associated evidence that is used to further filter the number of entries. The evidence for an entry, $p(d) = \int p(d|\lambda)\,d\lambda$, is proportional to the number of chain samples it contains. The evidence for an entry is computed as the number of chain samples in the entry divided by $N_{\rm total}/N_{\rm GW}$, where $N_{\rm total}$ is the total number of samples in the chain. A threshold evidence of 0.5 must be exceeded for a particular entry to be included in the final catalogue. The filtered parameter chains for each entry are used to form additional catalogue products. An entry's point estimate is chosen to be the sample that corresponds to the median of the marginalised posterior on $f_0$. We also compute the full multimodal $N_P \times N_P$ covariance matrices for each mode of the posterior. These are then used for covariance matrix proposals as more data are acquired. From the point estimates of each entry, waveform entries are computed. Finally, metadata about the catalogue are stored, including the total number of above-threshold entries, their weights and S/N, and the full set of posterior samples for each entry. History data are also included, which simply link catalogue entries with preceding $T_{\rm obs}$ catalogue entries, if such a link exists. This is dependent on the correlation of entry waveforms with preceding $T_{\rm obs}$ catalogue entry waveforms. A default threshold of $M_* = 0.5$ is again used to make an association, but the user can adjust this as needed.
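The evidence cut and the point estimate can be illustrated with a short sketch. It assumes each sample is a dictionary carrying an `f0` value and that, for an even number of samples, the upper of the two middle samples is an acceptable stand-in for the median; the function names are hypothetical.

```python
def entry_evidence(n_entry_samples, n_total, n_gw):
    """Entry evidence: samples in the entry divided by (N_total / N_GW)."""
    return n_entry_samples / (n_total / n_gw)


def passes_threshold(n_entry_samples, n_total, n_gw, threshold=0.5):
    """True if the entry evidence exceeds the catalogue threshold."""
    return entry_evidence(n_entry_samples, n_total, n_gw) > threshold


def point_estimate(entry_samples):
    """Sample at the median of the marginal f0 distribution of the entry."""
    ordered = sorted(entry_samples, key=lambda s: s["f0"])
    return ordered[len(ordered) // 2]


# Toy chain: 100 iterations of a 3-source model, so N_total = 300 samples.
print(entry_evidence(80, 300, 3), passes_threshold(30, 300, 3))
print(point_estimate([{"f0": 3.0}, {"f0": 1.0}, {"f0": 2.0}]))
```

With this normalisation an entry that appears in every iteration of the chain has evidence 1, so the 0.5 threshold keeps entries that are present in more than half of the iterations.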
The numbers of catalogue UCBs for our 1.5, 3, 6 and 12-month LDC analyses are shown in Table 1, along with UCB number as a function of different S/N cuts in Table 2. In Table 3 we make various parameter cuts on catalogue UCBs that have frequencies above 5 mHz. Catalogue UCBs with at least 90% of their samples meeting the particular parameter cut criteria are included in the count. In Figure 2 we show parameter posteriors ($f_0$, $A$, sky location, $\dot{f}$, $\iota$) for a well-localised eclipsing UCB as a function of observing time. In Figure 3, we graph the chirp mass and luminosity distance posteriors using Equation (2). The LDC values of chirp mass and distance, $0.4004\,M_\odot$ and 14.39 kpc, are within 1σ uncertainty of the derived chirp mass and distance, $0.4073^{+0.0040}_{-0.0057}\,M_\odot$ and $14.12^{+0.86}_{-0.77}$ kpc. In Figure 4 we show the power spectral density for the full frequency band that was analysed. The residual becomes smoother with time, and one can clearly see the excess confusion noise below ∼4 mHz. Figure 5 shows joint posteriors for sky location in ecliptic coordinates for each observation time (1.5, 3, 6, and 12 months), from left to right starting from the top. The sky location posteriors are graphed using every tenth sample.
One use of the catalogue data is to build posterior-based proposals for individual UCBs in future global fits to the LISA data. Updates to UCB parameters are proposed independently of other UCBs in the catalogue. The cadence of applying new proposals to the global fit is to be determined for the LISA mission.
For this analysis, we applied covariance matrix proposal updates to the 3, 6, and 12-month analyses, using the 1.5, 3 and 6-month catalogue UCB parameter distributions, respectively. Along with a given catalogue, history tree data are produced linking the UCBs of a catalogue to UCBs in the preceding $T_{\rm obs}$ catalogue. The history data will be useful to the observer when determining which catalogue UCBs are potentially confused. This is especially useful for two UCBs nearby in frequency and sky location, as discussed in the second corner case of the Appendix.
The overlap integral (4) used in the catalogue production step is also used to cross-correlate LDC injected waveforms and GBMCMC catalogue waveforms, to determine which catalogue UCBs have a matching LDC injection, with $M_{ij} > 0.8$. We see the results of this in the next section, where we compare our catalogues to the population of LDC injections to quantify the efficacy of our search.
Table 2: Same as Table 1 but with S/N cuts on $M_{ij} > 0.8$ catalogue UCBs (top) and catalogue UCBs with $0.5 < M_{ij} < 0.8$ (bottom). Catalogue UCBs that have $M_{ij} < 0.5$ are not included in the S/N cut.

Comparing the GBMCMC catalogues to LDC injections
Since we are dealing with a simulated data stream which comes with the parameter values describing the waveforms for every LDC UCB in the data, it is possible to check the efficacy of our search pipeline by cross-correlating our catalogue waveforms with the LDC waveforms using Equation (4). The methods and results of this process are presented next. These results inform the efficacy of our search pipeline across observing time.
Each catalogue UCB is classified as a matched, confused or false alarm detection. The different classifications are explained below. When the correlation coefficient $M_{ij}$ of a catalogue-injection pair exceeds a threshold of 0.8 we regard these as a 'matched' pair. There is typically one LDC injection meeting this criterion for the given catalogue UCB. Though it occurs less often, we shall see that it is possible for one catalogue UCB to have a match with two injections. In general, we refer to catalogue UCBs that have a match with one or more LDC injections as matched.
The majority of catalogue UCBs that are not matched are classified as confused. Confused catalogue UCBs can be further divided into two sub-categories of blending: (1) two UCBs each have a positive overlap with the same injection and the sum of the two UCB waveforms has a larger overlap with that injection; (2) a single UCB has a larger overlap with the sum of two injected waveforms than with each injected waveform alone.
A histogram of the number of catalogue UCBs in each of the three categories (matched, confused and false alarm) as a function of observing time is shown in Figure 6. The fraction of catalogue UCBs which have a match with an LDC injection is 0.74, 0.88, 0.82, and 0.79 for the 1.5, 3, 6 and 12-month catalogues, respectively. The fraction of catalogue UCBs which are confused is 0.26, 0.12, 0.18, and 0.21 (for the 1.5, 3, 6 and 12-month catalogues). The larger the fraction of confused sources in each catalogue, the smaller the fraction of matched sources. Figure 7 shows the cumulative distribution of cross-correlation values between the catalogue UCBs and LDC injections for each $T_{\rm obs}$. The fraction of catalogue UCBs with match below $M_{ij} = x$ is shown on the vertical axis, and one notices that the fraction of 1.5-month catalogue UCBs below the match threshold of 0.8 is significantly larger than for the other $T_{\rm obs}$ catalogues. For such a short observation time, a larger fraction of confused sources is expected.
Even though blending occurs more often for confused catalogue UCBs, it is possible for matched UCBs to exhibit blending as well. For matched 12-month catalogue UCBs we find that the second type of blending occurs exclusively. In the top graph of Figure 8 we show the sky locations of blended 12-month catalogue UCBs with $M_{ij} > 0.8$, and in the bottom graph the sky locations of blended catalogue UCBs with correlation values in the range $0.5 < M_{ij} < 0.8$. The blended UCBs are represented as 'x' markers with a black border. In each graph, the underlying distribution of points shows the catalogue UCBs meeting a correlation threshold of $M_{ij} > 0.8$ and $0.5 < M_{ij} < 0.8$, for the top and bottom graphs, respectively.
In Figure 9 we show the $A$-$f_0$ plane of all matching catalogue UCBs in blue. Catalogue UCBs that do not have a match are graphed with a colour-bar indicating the largest correlation value with an LDC injection. Each graph displays a different correlation range. The top graph highlights non-matching UCBs with $M_{ij} < 0.5$ and the bottom graph highlights non-matching UCBs with $0.5 < M_{ij} < 0.8$. From both graphs, it is clear that non-matching catalogue UCBs are primarily below 5 mHz. Some of these catalogue UCBs also suffer from blending, that is, they also have $M_{ij} < 0.5$ with nearby LDC injections that are matched with other catalogue UCBs, which is one reason to set a higher match threshold, in our case 0.8.
Table 3: Same as Tables 1 and 2, but with different cuts on the catalogue UCBs with frequencies above 5 mHz to identify UCBs worthy of follow-up analysis by EM observatories. A catalogue UCB is added to each column when at least 90% of its MCMC samples satisfy the given condition. The first cut gives the number of UCBs that are 'well-localised', with sources contained within a sky location area of 10 deg$^2$. The next two columns give the number of UCBs with positive and negative frequency derivatives. The number of eclipsing binaries is shown in column 5, and this category is further subdivided in the last two columns: eclipsing UCBs with GW-dominated frequency evolution, and UCBs that are both eclipsing and well-localised. From this final category we select a high frequency 12-month catalogue UCB as an example of a target for follow-up EM observations and archival searches.

There is one more type of catalogue UCB that is not classified as confused or matched according to the categories described above. UCBs are classified as a 'false alarm' when no LDC injection exists within $10/T_{\rm obs}$ of the UCB frequency. However, the false alarms identified in our analysis are due to boundary effects. As $T_{\rm obs}$ becomes larger, the bandwidth of a UCB signal also becomes wider, leading to long waveform tails. When a long waveform tail extends beyond the boundary of an analysis segment into a neighbouring segment, a false alarm can emerge in the neighbouring segment's catalogue. In post-processing we explored the highest frequency false alarm in the 12-month catalogue. This false alarm UCB is located at $f_0 \sim 16.64$ mHz. This frequency is on the boundary between two analysis segments, and the false alarm UCB waveform template overlaps with the waveform tails of two matched, or recovered, bright UCBs on either side of the boundary. Further examination of the other false alarm UCBs in the catalogue data reveals that each is a symptom of this boundary effect. The symptom is alleviated by allowing communication between neighbouring analysis segments as they run in parallel, such that the residual curve is consistent across the boundary (Littenberg & Cornish 2023).
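The three categories used in this section can be sketched as a simple decision rule. This is a simplified illustration, not the pipeline's post-processing code: it assumes each catalogue UCB has already been reduced to its largest correlation value with any LDC injection, it ignores the one-to-one requirement on matches, and the function name, argument names and toy frequencies are hypothetical.

```python
YEAR = 365.25 * 86400.0  # seconds in one Julian year


def classify(ucb_f0, best_match, injection_f0s, t_obs,
             match_threshold=0.8, n_bins=10):
    """Label one catalogue UCB as 'matched', 'confused' or 'false alarm'.

    ucb_f0        : catalogue UCB frequency in Hz
    best_match    : largest correlation value with any LDC injection
    injection_f0s : frequencies (Hz) of the LDC injections
    t_obs         : observation time in seconds
    """
    window = n_bins / t_obs  # 10 / T_obs, in Hz
    has_neighbour = any(abs(f - ucb_f0) <= window for f in injection_f0s)
    if not has_neighbour:
        return "false alarm"
    if best_match > match_threshold:
        return "matched"
    return "confused"


# Toy injections (hypothetical values) and a 12-month observation.
injections = [0.005000, 0.005002]
label_a = classify(0.0050001, 0.95, injections, YEAR)
label_b = classify(0.0050019, 0.60, injections, YEAR)
label_c = classify(0.0050010, 0.10, injections, YEAR)
```

For a 12-month observation the window $10/T_{\rm obs}$ is about $3 \times 10^{-7}$ Hz, so the third toy source, which sits $10^{-6}$ Hz from both injections, is labelled a false alarm.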
We identified all 12-month catalogue UCBs that have $M_{ij} > 0.8$, are eclipsing and are well-localised, and graph their chirp mass as a function of distance, together with their sky locations, in Figure 10. From this group of matching, eclipsing and well-localised 12-month catalogue UCBs, we selected the lowest frequency source and graphed its parameter posteriors in Figure 11. It has ι = 88.50° ± 0.17° and an orbital period of ∼8 minutes. The frequency derivative is constrained at 12 months and is positive, $\dot{f} = 63.4^{+11.6}_{-8.8} \times 10^{-17}\,\mathrm{s}^{-2}$. The Galactic sky location in degrees is $(\theta, \phi) = (-61.70^{+0.18}_{-0.13},\ 195.55^{+0.16}_{-0.13})$. The system is relatively close, at a distance of $1.37^{+0.25}_{-0.22}$ kpc, and its chirp mass is $0.751^{+0.074}_{-0.078}\,M_\odot$. Even though this recovered UCB has a high correlation value ($M_{ij} = 0.9995$), comparing the joint $A$-ι posterior to the injected amplitude and inclination reveals that these parameters were not accurately recovered. No correlation is visible in the $A$-ι plane, indicating insufficient sampling of the parameter space. Moreover, this 12-month UCB system is also found in the 1.5, 3 and 6-month catalogues with correlation values > 0.8. For reference, this low frequency catalogue UCB overlaps with more than ∼30 UCBs, all of which are packed within a frequency range of only $\pm 2^{8}/T_{\rm 12mo}$. There are 12-month catalogue UCBs with even lower frequencies and with S/N ≳ 8, but they have poorly constrained sky locations and distances due to confusion with tens of thousands of unresolved UCBs. More observation time is required to determine if the lowest frequency systems are suitable for EM follow-up observations.
Posteriors for a distant ($D_L = 16.1 \pm 2.4$ kpc) and reasonably well-localised (22 deg$^2$) UCB in the 12-month catalogue, with a correlation value $M_{ij} = 0.99$ and S/N of 36, are shown in Figure 12. One sees that the GW parameters $A$ and ι are strongly correlated. This high S/N 12-month UCB system is matched in the 1.5, 3 and 6-month catalogues, with S/N = 2, 18, 22, respectively. The 3D location of this binary system places it in the part of the Milky Way that is inaccessible to optical telescopes due to obscuration by intervening dust and gas, called the Zone of Avoidance. LISA will complement radio and infrared surveys in providing a new view of this part of the Milky Way Galaxy.
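To illustrate how a chirp mass and distance follow from the GW observables of the nearby eclipsing source, the sketch below assumes that the relation referred to as Equation (2) is the standard leading-order one for a circular binary whose frequency evolution is driven by GW emission alone: $\dot{f} = \frac{96}{5}\pi^{8/3}(G\mathcal{M}_c/c^3)^{5/3} f_0^{11/3}$ and $A = 2(G\mathcal{M}_c)^{5/3}(\pi f_0)^{2/3}/(c^4 D_L)$. The function names are illustrative, and the GW frequency is approximated as twice the orbital frequency of an 8-minute orbit.

```python
import math

G = 6.67430e-11              # gravitational constant, m^3 kg^-1 s^-2
C = 299792458.0              # speed of light, m s^-1
M_SUN = 1.98847e30           # solar mass, kg
KPC = 3.0856775814913673e19  # kiloparsec, m


def chirp_mass(f0, fdot):
    """Chirp mass in solar masses from GW frequency (Hz) and its derivative (Hz/s)."""
    x = 5.0 * fdot / (96.0 * math.pi ** (8.0 / 3.0) * f0 ** (11.0 / 3.0))
    return (C ** 3 / G) * x ** 0.6 / M_SUN


def luminosity_distance(f0, fdot, amplitude):
    """Luminosity distance in kpc from f0 (Hz), fdot (Hz/s) and GW amplitude A."""
    return 5.0 * C * fdot / (48.0 * math.pi ** 2 * f0 ** 3 * amplitude) / KPC


# Median values quoted in the text; f0 is only approximate (period ~8 minutes).
f0 = 2.0 / (8.0 * 60.0)  # GW frequency = 2 / orbital period, about 4.17 mHz
fdot = 63.4e-17          # Hz/s
mc = chirp_mass(f0, fdot)
```

With these rounded inputs the chirp mass comes out at roughly 0.7 solar masses, in line with the quoted posterior. Applying the two functions to every posterior sample of $(f_0, \dot{f}, A)$, rather than to the medians, is what re-sampling the posteriors amounts to under this assumption.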
In the Appendix, we put the data obtained from comparing the GBMCMC catalogues to the injections to further use by examining a few 12-month frequency segments that contain confused catalogue UCBs.In particular, we explore catalogue UCBs that have a match at six months but do not appear as matched UCBs in the 12-month catalogue.Each corner case involves blended 12-month catalogue UCBs of the first type discussed in the beginning of this section.

Discussion
This paper reports on the analysis of the LISA Data Challenge Radler data with the GBMCMC code. This work is a necessary step in our efforts to prepare for the global fit required for the LISA mission. The LDC simulated data stream contains millions of white dwarf binary signals. We divided the time series data into observation times, $T_{\rm obs}$, of 1.5, 3, 6 and 12 months and produced catalogues for each $T_{\rm obs}$ by performing a global fit to the resolvable binaries.
GBMCMC is a trans-dimensional reversible jump MCMC algorithm with parallel tempering. The MCMC sampling was done in frequency, with the full frequency band divided into a total of 3317 frequency segments. Bayesian inference is used to select the highest evidence model to build catalogues for each of the 3317 frequency segments as a function of observing time. These are combined for each observing time to create the 1.5, 3, 6 and 12-month catalogues.
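In a trans-dimensional sampler with equal prior weight on each model, the fraction of iterations the chain spends at a given model dimension estimates that model's posterior probability, so the highest evidence model is the most visited one. The sketch below illustrates this idea for a single frequency segment; the function name and the toy chain are hypothetical, and the actual GBMCMC catalogue construction involves further steps not described here.

```python
from collections import Counter


def most_probable_model(n_sources_chain):
    """Return the most visited model dimension and the estimated model posterior.

    n_sources_chain : number of UCB templates in the model at each MCMC iteration
    """
    if not n_sources_chain:
        raise ValueError("empty chain")
    counts = Counter(n_sources_chain)
    total = len(n_sources_chain)
    posterior = {n: c / total for n, c in counts.items()}
    best = max(posterior, key=posterior.get)
    return best, posterior


# Toy chain of model dimensions for one frequency segment (hypothetical).
chain = [2, 3, 3, 3, 4, 3, 3, 2, 3, 3]
best_n, model_posterior = most_probable_model(chain)
```

In this toy chain the three-source model is visited in 7 of 10 iterations, so it would be the one used to build the segment's catalogue under the stated assumption.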
The UCB catalogue waveforms are then cross-correlated with the known LDC injected waveforms to determine the efficacy of our search pipeline. For each observation time, we quantify the number of matching, confused, and false detections in the catalogues. We recover more than 10,000 binaries after 12 months of observing. We found that 7,255 of these 12-month catalogue UCBs have a match with an LDC injection (with a correlation value $M_{ij} > 0.8$) and have S/N > 7. Of these, there are 128 UCBs with S/N > 100.
We identify two interesting corner cases for in-depth follow-up analysis. For the first Appendix corner case we investigate two confused 12-month catalogue UCBs, the sum of which is a match with a single LDC injection. Moreover, the two catalogue UCBs have a common ancestor in the 6-month catalogue; namely, they both have a match value greater than 0.5 with a single 6-month catalogue UCB. This indicates that the 12-month MCMC analysis has not converged. Smaller time-jumps between catalogues are a target for future investigation.
In the second corner case, we examined a different type of confused UCB in the 12-month catalogue. There are two UCBs occupying the same region of the $A$-$f_0$ and θ-ϕ planes. In this region of parameter space, there are two LDC injections. After a 12-month analysis neither of the injections is separately a match with the confused UCBs; however, the sum of the UCB waveforms is a match with the sum of the LDC injected waveforms. As in the previous corner case, we find that the 12-month MCMC analysis for this analysis segment has not converged. Additional incremental analysis is necessary between the 6-month and 12-month catalogues to disentangle these two LDC injections.
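The behaviour seen in the corner cases, where individual catalogue waveforms fall below the match threshold while their sum matches, can be reproduced with a toy example. The sketch assumes that the correlation value is the usual normalised noise-weighted overlap, $M_{ij} = (h_i|h_j)/\sqrt{(h_i|h_i)(h_j|h_j)}$; the four-bin waveforms and flat noise level are invented for illustration and do not represent LDC signals.

```python
import math


def inner(a, b, psd):
    """Noise-weighted inner product (a|b), up to a constant that cancels in the match."""
    return sum((x * y.conjugate()).real / s for x, y, s in zip(a, b, psd))


def match(a, b, psd):
    """Normalised overlap between two frequency-domain waveforms."""
    return inner(a, b, psd) / math.sqrt(inner(a, a, psd) * inner(b, b, psd))


# Toy frequency-domain data: one injection split between two catalogue templates.
psd = [1.0, 1.0, 1.0, 1.0]
injection = [1 + 0j, 1 + 0j, 1 + 0j, 1 + 0j]
ucb_1 = [1 + 0j, 1 + 0j, 0j, 0j]
ucb_2 = [0j, 0j, 1 + 0j, 1 + 0j]
summed = [x + y for x, y in zip(ucb_1, ucb_2)]

m1 = match(ucb_1, injection, psd)      # each template alone: between 0.5 and 0.8
m2 = match(ucb_2, injection, psd)
m_sum = match(summed, injection, psd)  # the summed waveform recovers the injection
```

Each template alone has a match of $1/\sqrt{2} \approx 0.71$ with the injection, so both would be labelled confused at a threshold of 0.8, whereas the summed waveform has a match of 1.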
These corner cases are part of a broader discussion of a strategy for creating and publishing catalogues, namely how often the catalogues will be updated and released, and the type and format of the information they contain. Alerts and low-latency analysis and outputs will be informed by our time-evolving catalogues. Further in-depth analysis of blended catalogue UCBs in data challenges will be essential for answering some of these questions. Checking the convergence of analysis segments that have catalogue UCBs which share the same parent, or are close in sky location and frequency-amplitude, will be necessary. Accurate UCB science, and low frequency GW science in general, depend heavily on validation of the data products output from search pipelines.
We make one last comment regarding future work. The UCBs that make up the Radler LDC simulated data stream have zero eccentricity, which is likely the case for most resolvable binaries. However, eccentric UCBs do exist, and one of the future upgrades to GBMCMC will be to incorporate eccentric UCB models. The harmonics of low frequency eccentric binaries will be put to use in the search routine.

Fig. 7: Cumulative distribution of catalogue UCB matches over all frequencies with correlation values greater than 0.5. Observation times between 1.5 and 12 months are represented.

Fig. 8: Sky location of blended 12-month catalogue UCBs. In the top and bottom graphs, the sky locations of blended 12-month catalogue UCBs are highlighted as dark pink and yellow crosses with a black border. The underlying distribution of points shows all catalogue UCBs meeting a certain correlation threshold. This threshold is $M_{ij} > 0.8$ in the top graph, and the range $0.5 < M_{ij} < 0.8$ is used in the bottom graph. The highest concentration of points is near the galactic centre. The overlapping dark pink markers represent catalogue UCBs that have a match with more than one injection (the second type of blending described in Section 4). Each of the sparser yellow markers represents a catalogue UCB that fits the same injection as another catalogue UCB (the first type of blending described in Section 4). For $M_{ij} > 0.8$, the first type of blending is absent in the 12-month catalogue. One can see that most of the blended catalogue UCBs are near the galactic centre.

Fig. 1: Waveform evolution as a function of observing time. Individual waveforms are plotted over the input A-channel PSD (solid, red curve), the noise (constant, black curve stretching across the frequency window) and the residual (dashed, blue curve). The top graphs show a single 1.5-month (left panel) and 3-month (right panel) catalogue waveform, in green. The bottom panels show, in total, three 6 and 12-month catalogue waveforms, two of which did not appear in any earlier catalogue. All injected signals in this frequency window were recovered in the 6 and 12-month analyses.

Fig. 2: A well-localised, eclipsing and chirping UCB for EM follow-up. The LDC parameter values are shown as black markers on the GBMCMC posteriors. The 1σ and 2σ posterior curves are graphed in this corner plot. Colours pink, purple, and green are the 3, 6, and 12-month posteriors, respectively.

Fig. 3: Luminosity distance and chirp mass for a well-localised, eclipsing and chirping UCB. The 12-month posteriors of ḟ and $A$ for the binary from Figure 2 were re-sampled using Equation (2) to form the luminosity distance $D_L$ and chirp mass $M_c$ posteriors. The black marker on the graph corresponds to the $D_L$ and $M_c$ derived by substituting the injected LDC parameters into Equation (2). See Figure 2 for a description of the three colours.

Fig. 4: The power spectra for the analysed LISA band, for the 1.5, 3, 6, and 12-month catalogues. The red curve is the A-channel input data, and the dashed, blue curve is the residual after the catalogue UCBs have been subtracted. The black curve plotted on top of the data and residual is the noise level.

Fig. 5: A Mollweide projection of the sky location posteriors for the 1.5, 3, 6, and 12-month catalogue UCBs in galactic coordinates, with the same ordering as Figure 4. Every tenth posterior sample has been used to construct the sky location graphs.

Fig. 6: Number of catalogue UCBs with observing time, using a logarithmic scale. A catalogue UCB is classified as matched (with a single catalogue UCB and injection satisfying $M_{ij} > 0.8$), confused or false.

Fig. 10: Chirp mass versus distance for eclipsing and well-localised catalogue UCBs that are matched. The graph on the left shows chirp mass versus distance for all matched 12-month catalogue UCBs that are eclipsing and have positive ḟ, coloured by GW frequency. Eclipsing UCBs are defined here as having more than 90% of their inclination angle samples within 70° < ι < 110°. The graph on the right shows the ecliptic coordinates of the UCBs from the left graph, now coloured by distance. These are plotted over all 12-month catalogue UCBs that have $M_{ij} > 0.8$ (light blue).
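
The sample-fraction criterion used for the eclipsing cut, and more generally for the cuts of Table 3, can be sketched as follows. The helper names are hypothetical, and the sketch uses "at least 90%" as in the Table 3 description, whereas the eclipsing definition above is phrased as "more than 90%".

```python
def passes_cut(samples, condition, min_fraction=0.9):
    """True when at least min_fraction of the posterior samples satisfy condition."""
    if not samples:
        return False
    n_pass = sum(1 for s in samples if condition(s))
    return n_pass / len(samples) >= min_fraction


def is_eclipsing(inclination_deg_samples):
    """Eclipsing cut: inclination samples (degrees) within 70 < iota < 110."""
    return passes_cut(inclination_deg_samples, lambda i: 70.0 < i < 110.0)


def has_positive_fdot(fdot_samples):
    """Positive frequency derivative cut."""
    return passes_cut(fdot_samples, lambda x: x > 0.0)


# Toy posterior samples (hypothetical values).
edge_on = [88.0 + 0.01 * k for k in range(100)]  # all samples inside the range
mixed = [88.0] * 85 + [60.0] * 15                # only 85% inside the range
eclipsing_flags = (is_eclipsing(edge_on), is_eclipsing(mixed))
```

The same helper applies to any per-sample condition, for example a sky-area or frequency-derivative cut.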

Fig. 11: The 1σ and 2σ parameter posteriors for the lowest frequency matching 12-month catalogue UCB that is eclipsing and localised to within 10 deg$^2$ on the sky. The black diamonds on the 2D posteriors represent the LDC injected parameter values. One can see that the amplitude and inclination angle have not been recovered, and no correlation is visible in the $A$-ι plane. Increasing the number of MCMC steps and searching over a longer observation time are needed to better determine these parameters and recover the expected correlation between them.

Fig. 12: The 1σ and 2σ parameter posteriors for a distant ($D_L = 16.1 \pm 2.4$ kpc) matching 12-month catalogue UCB that is localised to within 22 deg$^2$ on the sky. The black diamonds on the 2D posteriors represent the LDC injected parameter values. In galactic coordinates, the source line of sight is ∼3 degrees offset from the galactic centre. At a distance of ∼16 kpc, this places the UCB behind the galactic centre in a region inaccessible to optical telescopes.
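
The geometry behind this statement can be checked with the law of cosines. The sketch assumes a Sun to galactic centre distance of 8.2 kpc, a value that is not given in this paper, and the function names are illustrative.

```python
import math


def galactocentric_distance(d_kpc, offset_deg, r0_kpc=8.2):
    """Distance of a source from the galactic centre, in kpc.

    d_kpc      : heliocentric distance to the source
    offset_deg : angle between the source and the galactic centre on the sky
    r0_kpc     : assumed Sun to galactic centre distance (not from the paper)
    """
    theta = math.radians(offset_deg)
    return math.sqrt(d_kpc ** 2 + r0_kpc ** 2
                     - 2.0 * d_kpc * r0_kpc * math.cos(theta))


def is_beyond_centre(d_kpc, offset_deg, r0_kpc=8.2):
    """True when the source lies on the far side of the galactic centre."""
    return d_kpc * math.cos(math.radians(offset_deg)) > r0_kpc


r_gc = galactocentric_distance(16.1, 3.0)
far_side = is_beyond_centre(16.1, 3.0)
```

Under this assumption the source lies roughly 8 kpc beyond the galactic centre, on the far side of the Galaxy as seen from the Sun.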

Fig. A.1: Corner case 1: The 1σ and 2σ posteriors for the $f_0$-$A$ and sky location planes for two non-matching catalogue UCBs, for the 24 month analysis. The injected parameter values are displayed as black markers.

Fig. A.2: The PSD for corner case 1. The A-channel power spectral density of the sum of confused catalogue UCB waveforms (LDC0058924130+LDC0058923825) is in orange, and the LDC injected waveform (INJ005892423) that matches with the summed catalogue waveforms is the solid blue curve. The original data are in red, and the residual is plotted as a dashed blue curve. One can see from the red curve that there are at least two high S/N UCBs on either side of the graphed waveforms. Except for INJ005892423, all injected waveforms have been recovered.

Case  Central frequency   Number of LDC   Number of catalogue UCBs   Number of matches
      of segment (mHz)    injections      3mo  6mo  1yr  2yr         3mo  6mo  1yr  2yr
1     5.89600             7               2    6    7    -           2    6    5    -
2     5.21647             13              4    11   12   12          3    11   10   10

Fig. A.3: Corner case 2: The 1σ and 2σ posteriors for the $f_0$-$A$ and sky location planes for two non-matching catalogue UCBs, for the 24 month analysis. The injected signal values are the black markers.

Fig. A.4: The PSD for corner case 2. The solid blue curve is the sum of LDC waveforms INJ00520923 and INJ00520931. The waveform sum of the two catalogue UCBs, in orange, is a match with the sum of LDC waveforms. No other combination is a match. One can see from the red curve that there are high S/N UCBs on either side of the graphed waveforms. With the exception of INJ00520923 and INJ00520931, all injected waveforms have been recovered.

Table 1: Total number of GBMCMC catalogue UCBs as a function of observation time.