Enhancing Gravitational-Wave Science with Machine Learning

Machine learning has emerged as a popular and powerful approach for solving problems in astrophysics. We review applications of machine learning techniques for the analysis of ground-based gravitational-wave detector data. Examples include techniques for improving the sensitivity of Advanced LIGO and Advanced Virgo gravitational-wave searches, methods for fast measurements of the astrophysical parameters of gravitational-wave sources, and algorithms for reduction and characterization of non-astrophysical detector noise. These applications demonstrate how machine learning techniques may be harnessed to enhance the science that is possible with current and future gravitational-wave detectors.


Introduction
In 2015, the Advanced LIGO (LIGO) [1] and Advanced Virgo (Virgo) [2] Gravitational-Wave (GW) detectors made the first observation of a GW signal from a stellar-mass Compact Binary Coalescence (CBC) system [3,4]. Nine additional Binary Black Hole (BBH) mergers and one Binary Neutron Star (BNS) merger were observed by LIGO and Virgo during the first two advanced detector observing runs (O1/O2) [5]. The six-month-long O3a run (April 1st to October 1st, 2019) and the recently completed O3b run have yielded tens of new BBH, BNS and Neutron Star-Black Hole (NSBH) detections [6,7] and detection candidates [8]. The current rate of detections is expected to increase in future observing runs, as LIGO and Virgo approach their design sensitivity and additional detectors such as KAGRA [9,10] and LIGO-India [11] join the network of ground-based GW observatories. The improved sensitivity of the GW network will allow scientists to gain insights into the origins and astrophysical distributions of CBC GW sources [12], test general relativity [13], and measure cosmological parameters such as the Hubble constant [14]. It may also lead to the discovery of GW signals from new source types, such as core-collapse supernovae (CCSNe) [15] and magnetars [16], or a stochastic GW background of cosmological or astrophysical origin [17].
Despite all the initial successes, the future of GW astronomy is facing many challenges. Processing and analyzing the increased rate of detections in future observing runs will require researchers to streamline current search pipelines. Refined astrophysical investigations of GW sources and tests of the fundamental nature of gravity will require precise reconstructions of GW signals and accurate estimates of their statistical and systematic errors. Identification and mitigation of instrumental and environmental data artifacts will require the development of fast and efficient methods for detector and signal characterization.
Machine Learning (ML) algorithms are novel methods for tackling these issues. The LIGO and Virgo Scientific Collaborations run searches for modeled and unmodeled astrophysical transient signals, as well as searches for continuous GWs from isolated compact objects and searches for a stochastic background of GWs of cosmological or astrophysical origin. These searches rely on different techniques, such as matched filtering [18,19], time-coincident detection of coherent excess power between multiple detectors [20,21], and cross-correlation methods [22]. Because of the effectiveness of ML algorithms in identifying patterns in data, ML techniques may be harnessed to make all these searches more sensitive and robust. Applications of ML algorithms to GW searches range from building automated data analysis methods for low-latency pipelines to distinguishing terrestrial noise from astrophysical signals and improving the reach of searches, as well as the statistical significance of detections.
In this paper, we review the ML-based techniques that have been developed by LIGO and Virgo scientists to improve the analysis of GW data. In recent years, ML has gained popularity among LIGO and Virgo researchers thanks to its successful application to the detection and classification of noise transients [23,24,25,26,27,28,29,30,31,32], searches for CBC systems [33,34,35,36], parameter estimation of transient signals [37,38,36,39,40], noise removal [41], and citizen science projects [42]. We also examine the potential of ML to improve GW science as current detectors approach their design sensitivity and other detectors join the GW network of ground-based observatories.
The structure of the paper is as follows. In Section 2 we review the ML algorithms which have been developed to improve the quality of LIGO-Virgo data. In Section 3 we review ML techniques which aim at improving the modeling of GW signals. In Section 4 we describe how ML can be used to improve the sensitivity of GW searches. In Section 5 we review methods for parameter estimation of GW signals and source population inference. Conclusions are given in Section 6.

Algorithms for gravitational-wave data quality improvement
The output of a GW detector is a temporal series of the detector strain, h(t). The sensitivity of an ideal detector is determined by the physics inherent to the instrumental design and is limited by fundamental, irreducible noise sources, such as quantum noise of the laser light, thermal noise of the mirror coatings and optic suspensions, or seismic noise [1,2]. The sensitivity of a real-world detector is also limited by the presence of technical noise sources of different origins, related for example to the feedback control systems that are needed to maintain the systems in operation, or by instrumental and environmental disturbances. Often these noise sources are non-stationary, i.e., their statistical properties vary over short or long time scales. In some instances, noise sources might be stationary at their origins, but couple to the detector strain in a nonlinear way.
The non-stationary, non-Gaussian nature of the data and the presence of noise artifacts may impact data quality or detector performance, and increase the false alarm rate of searches. Short-lived environmental and instrumental noise disturbances, known as detector "glitches," may also affect low-latency detection and parameter estimation of astrophysical transient signals, as evidenced by the GW170817 BNS detection [5]. These non-stationary and nonlinear transients, as well as continuous noise signals at given frequencies in the form of spectral lines, are two examples of major factors affecting the performance of GW searches [43].
Astrophysical signals have typical amplitudes comparable to the detector background noise. Therefore, characterization and reduction of detector noise is essential to GW searches. Some of the techniques for the identification and mitigation of LIGO-Virgo data quality issues are described in Ref. [6].

Machine learning for h(t) glitch characterization and classification
The first step in the characterization of detector transient noise is to distinguish the glitches from potential astrophysical signals and then classify them into different families. This task can be tackled by extracting features from each glitch time series and mapping these features to the target glitch types. However, glitches often exhibit a complex temporal and frequency evolution that can make their characterization with a fixed number of features very difficult [44]. Moreover, the increasing sensitivity of the detectors may lead to a larger number of glitch morphologies. ML can help solve the problem of glitch classification: ML algorithms can help identify the origin of these glitches and increase the sensitivity of GW transient searches. Earlier works on a variety of unsupervised ML methods [45,46] and neural networks [47] have shown that ML algorithms can be very effective.
Deep Convolutional Neural Networks (CNNs) [48,49] are extremely promising for glitch classification based on time-frequency representations. CNNs are designed to extract features from 2D matrices, such as images, and use these features for classification purposes. Feeding time-frequency transforms, Omega Scans [50], and Q-transforms [51] to a CNN-based deep network is an effective approach to glitch classification. Ref. [30] implements an image-based detection and classification pipeline built on 2D CNN layers. Tests with simulated GW signals show that this method provides ∼ 99% accuracy in classifying glitches and differentiating them from chirp-like signals. CNNs typically distinguish glitches of similar morphology more accurately than other ML approaches.
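As an illustration of the core operation such networks apply to glitch images, the following minimal sketch (plain NumPy, with a hand-made kernel standing in for learned weights and a synthetic spectrogram in place of a real Q-transform) passes a toy time-frequency map through a single convolution, ReLU, and max-pooling stage:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation of a single-channel image with a kernel."""
    kh, kw = kernel.shape
    H, W = image.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    H, W = x.shape
    H2, W2 = H // size, W // size
    return x[:H2 * size, :W2 * size].reshape(H2, size, W2, size).max(axis=(1, 3))

# A toy 64x64 "spectrogram": a glitch appears as a vertical stripe of excess power.
rng = np.random.default_rng(0)
spec = rng.normal(0.0, 1.0, (64, 64))
spec[:, 30:34] += 5.0

# A hand-made vertical-edge kernel; in a trained CNN these weights are learned
# from labeled glitch images rather than fixed by hand.
kernel = np.array([[-1.0, 2.0, -1.0]] * 3)
features = max_pool(relu(conv2d(spec, kernel)))
print(features.shape)  # (31, 31)
```

A real classifier stacks many such layers and ends with a fully connected softmax over the glitch classes; this sketch only shows the feature-extraction step.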
Along the same lines, Ref. [52] uses a Wavelet Detection Filter [53] to extract the features from the input data set. The algorithm performs glitch classification with a boosted gradient method [54] which could be suitable for real-time analysis.
One critical step in supervised learning approaches is the availability of labeled glitch samples. The citizen science project GravitySpy [42] addresses this issue. Based on the Zooniverse platform, GravitySpy leverages the advantages of citizen science and ML to design a socio-computational system to analyze and characterize transients in GW data. The citizen scientists classify images of glitches such as those shown in Figure  1. The glitch categories resulting from this manual classification process are then used as labels for supervised ML approaches [44]. In addition, as the citizen scientists identify new categories of transients, the ML algorithms are re-trained to take into account the new categories.

Glitch characterization and classification with auxiliary channels
The LIGO and Virgo detectors record data streams from a large number of subsystems that control different aspects of the instruments and monitor their state. These auxiliary channels include data from a variety of instrumental and environmental sensors, such as photodetectors and seismometers. These sensors can witness noise sources which couple to the interferometers. Their data can be used to diagnose and mitigate non-astrophysical couplings. An example is shown in Fig. 2.
A full, manual analysis of LIGO and Virgo auxiliary channel data is generally impracticable because of the huge number of instrumental and environmental monitoring sensors, amounting to several tens of thousands per interferometer. The power of ML to handle huge data sets proves invaluable in analyzing auxiliary channel data.
Methods for glitch identification based on auxiliary channel data have been extensively investigated within the LIGO and Virgo collaborations [27,55,56,57,58]. In these approaches, the GW channel is generally used to determine labels for the training samples while the glitch identification process relies only on information from auxiliary channels. Once the model is trained, it is fully independent of the GW channel data and considers only parameters computed from auxiliary channels known not to be related to astrophysical signals.
The first study of canonical ML algorithms, such as Random Forests (RF), neural networks, and support vector machines, within a glitch detection framework was published in Ref. [27]. Since then, multiple authors have investigated ML algorithms to infer the presence of non-Gaussian noise in GW data through features in auxiliary channels.
iDQ is a glitch detection pipeline that can produce real-time data products in low latency [59,60]. iDQ decomposes the problem of glitch identification into a 2-class classification scheme within a supervised learning framework with several asynchronous tasks: training, cross-validation, calibration, and low-latency prediction. The pipeline utilizes glitch features in auxiliary channels generated in real-time [51,61] to construct supervised-learning training sets from recent data labeled by witnessing noise in the target channel, typically taken to be h(t). It then automatically trains a variety of ML algorithms to identify glitches in the target channel.
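The supervised-learning core of such a pipeline can be sketched as follows. This is not iDQ's actual implementation, but a toy logistic-regression classifier trained on synthetic auxiliary-channel features, with the labels playing the role of glitches witnessed in the target channel:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic training set: each row holds features (e.g. SNR, frequency) from
# two hypothetical auxiliary channels; label 1 = a glitch was seen in h(t).
n = 400
labels = rng.integers(0, 2, n)
features = rng.normal(0.0, 1.0, (n, 2))
features[labels == 1] += 1.5   # glitchy times show elevated auxiliary activity

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Logistic regression by gradient descent, standing in for iDQ's menu of
# supervised classifiers.
w, b = np.zeros(2), 0.0
for _ in range(2000):
    p = sigmoid(features @ w + b)
    grad_w = features.T @ (p - labels) / n
    grad_b = np.mean(p - labels)
    w -= 0.5 * grad_w
    b -= 0.5 * grad_b

# Low-latency prediction: probability that a glitch is present, computed from
# auxiliary-channel information only, independently of h(t).
p_glitch = sigmoid(features @ w + b)
accuracy = np.mean((p_glitch > 0.5) == labels)
print(f"training accuracy: {accuracy:.2f}")
```

In production the training, calibration, and prediction stages run asynchronously, and the classifier output is calibrated into a glitch probability rather than used as a raw score.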
iDQ operated in real-time throughout the first three LIGO-Virgo observing runs, providing probabilistic statements about the presence of glitches in LIGO data and their auxiliary witnesses.
iDQ's Ordered Veto List [55] algorithm contributed to the rapid release of the GW170817 BNS event by autonomously identifying the glitch coincident with the GW trigger in the LIGO Livingston detector [62] and in multiple auxiliary witnesses within 8 seconds of the event first being reported.
The LIGO Scientific Collaboration developed several additional promising approaches for auxiliary channel-based glitch identification in high-latency settings. These approaches typically use different auxiliary features, ML algorithms, and channel sets.
Two fast algorithms which aim at identifying the origin of glitches and can be used with minimal tuning to track their causes are based on RF and Genetic Programming (GP) [63]. RF and GP methods are interpretable, easy to use and tune, and can work with relatively small data sets without the inherent risk of overfitting. The algorithms require as input a list of times when a specific class of noise transients occurs and rely on features which are directly drawn from the numerical metadata generated by real-time LIGO-Virgo data quality pipelines, such as Omicron [64]. This approach minimizes the feature generation step of the process, which is the typical bottleneck for ML-based glitch investigations. The data sets are assembled with features from noise transients and randomly-selected background triggers. The algorithms can be quickly trained and run on LIGO-Virgo computing clusters. A typical analysis with a few thousand noise transients can be completed in minutes. The methods were validated in Ref. [63] on two sets of h(t) glitches with known origin from the first two LIGO-Virgo observing runs.
Another ML tool that utilizes auxiliary channel information is Elastic-net based ML for Understanding (EMU) [58]. EMU uses data from the full list of LIGO's auxiliary channels per detector site. The algorithm uses logistic regression with elastic net regularization [65,66] to predict the probability of a glitch. Instances where a glitch is predicted to be present are classified as 1 ("glitch") while instances where a glitch is predicted to be absent are classified as 0 ("glitch-free"), with a continuum of certainty between these limiting values. Like the other algorithms described above, EMU provides a measure of each auxiliary channel's significance in predicting h(t) glitches. This may enable researchers to uncover instrumental and environmental noise couplings to the GW data stream, which commissioners can use to eliminate the instrumental root of the glitches. EMU's initial performance is illustrated in Ref. [58]. It was also characterized using automatically clustered subsets of glitches, which can be derived according to frequency, duration, and other trigger generator parameters, or by using existing methods to identify classes of glitches [67].
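A minimal sketch of elastic-net logistic regression, with synthetic data in place of real auxiliary channels, illustrates how the combined L1/L2 penalty singles out witness channels (the channel indices and hyperparameters here are illustrative, not EMU's):

```python
import numpy as np

rng = np.random.default_rng(2)

# 20 hypothetical auxiliary channels; only channels 0 and 3 actually witness
# the noise coupling that produces glitches in h(t).
n, n_channels = 500, 20
X = rng.normal(0.0, 1.0, (n, n_channels))
y = (X[:, 0] + X[:, 3] + 0.3 * rng.normal(size=n) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Elastic-net logistic regression via proximal gradient descent:
# the L2 term enters the gradient, the L1 term acts by soft-thresholding.
l1, l2, lr = 0.01, 0.01, 0.5
w = np.zeros(n_channels)
for _ in range(3000):
    grad = X.T @ (sigmoid(X @ w) - y) / n + l2 * w
    w -= lr * grad
    w = np.sign(w) * np.maximum(np.abs(w) - lr * l1, 0.0)  # soft threshold

# Channel "significance": the surviving nonzero weights flag the witnesses.
witnesses = np.nonzero(np.abs(w) > 0.1)[0]
print(witnesses)  # the informative channels (0 and 3, by construction)
```

The sparsity induced by the L1 penalty is what makes the model interpretable: channels with zero weight carry no predictive information about glitches.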
Ref. [68] uses ML regression and clustering methods to infer peak ground velocities from Earthquake Early Warning alerts. The algorithm is trained on archival seismic data to determine the ground motion and the state of a GW interferometer during an earthquake. The estimated ground velocity is then used to forecast the potential effect of earthquakes on the detector in near real-time, making it possible to switch the detector control configuration during periods of excessive ground motion.
Hey LIGO [69] is an ML-based information retrieval tool which aims at supporting the commissioning and characterization efforts of GW observatories. The algorithm responds to a user query by searching the detector open source logbook data and returning information on detector operation, maintenance, and characterization tasks. The Hey LIGO web application incorporates a natural language processing-based information retrieval system that can also perform visualization of the user-queried data.

Methods for non-stationary noise subtraction and denoising
ML techniques can be used to identify, model, and subtract technical noise that couples in non-stationary or nonlinear ways. In particular, ML algorithms can be designed to construct filters for nonlinear noise subtraction. The output signal measured by LIGO and Virgo interferometers consists of the sum of three components: (1) noise which can be removed, (2) noise which cannot be removed, and (3) GW signals. Unfortunately, the noise sources which can potentially be filtered out do not necessarily couple linearly into the interferometer. Therefore, even after Wiener filtering [18], a substantial amount of nonlinear couplings pollutes the output. Leveraging the ability of ML to infer nonlinear functions, environmental and control data streams can be used as input to neural networks to find the transfer functions of the systems producing nonlinear noise in the detector output. The trained network can then be used to subtract those nonlinear couplings from the output data and lower the total noise floor. This method was implemented, for example, in DeepClean [70] and NonSENS [41].
An especially interesting case is when the noise source is monitored by one or more available fast signals and its coupling can be described as linear on short time scales, but with coupling transfer functions that vary on longer time scales. The method can be illustrated by considering the effect of the longitudinal control noise due to the signal recycling cavity feedback, which couples to the detector strain in a frequency region that spans between about 10 and 300 Hz. The coupling is modulated on time scales longer than about 1 second by the residual angular motion of the mirrors. In this case and similar circumstances, it is possible to efficiently track the time-varying noise coupling using the interferometer angular control signals and develop a stable, parametric model of the noise polluting the detector strain [41]. This model can be used to perform a time-domain subtraction of the noise that outperforms any linear, stationary scheme. This non-stationary noise subtraction scheme was successfully applied to LIGO data during the first half of the third observing run. The method allowed removal of the non-stationary power supply line coupling that was limiting the detector sensitivity at the mains frequency of 60 Hz and in a 4 Hz-wide band around 60 Hz due to sidebands created by the coupling modulation [41].
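The idea of tracking a slowly modulated linear coupling can be illustrated with a toy model: a fast witness signal whose coupling into the strain is modulated by a slow monitor. Regressing on the product of the two channels recovers the noise, while a constant-coefficient (stationary) fit cannot. All signals below are synthetic:

```python
import numpy as np

rng = np.random.default_rng(3)
fs, T = 1024, 8
t = np.arange(0, T, 1 / fs)

# Fast witness channel (e.g. a control signal) and a slow monitor (e.g. an
# angular signal) that modulates how strongly the witness couples to strain.
witness = rng.normal(0.0, 1.0, t.size)
slow = 1.0 + 0.5 * np.sin(2 * np.pi * 0.2 * t)      # varies on ~second scales
coupling_noise = 0.8 * slow * witness

strain = rng.normal(0.0, 0.1, t.size) + coupling_noise   # background + noise

# A stationary (constant-coefficient) subtraction cannot track the modulation...
c_stat = np.dot(strain, witness) / np.dot(witness, witness)
resid_stat = strain - c_stat * witness

# ...but regressing on the product feature witness*slow captures the
# time-varying transfer coefficient.
feature = witness * slow
c_mod = np.dot(strain, feature) / np.dot(feature, feature)
resid_mod = strain - c_mod * feature

print(np.var(resid_stat), np.var(resid_mod))
```

The residual variance after the modulated subtraction approaches the background level, whereas the stationary fit leaves behind the sideband power created by the modulation, mirroring the 60 Hz sideband example above in a drastically simplified setting.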

Gravitational waveform modeling
LIGO and Virgo searches for GWs from CBC systems, which rely on a matched filtering analysis [82,83,84,85,86], and the estimation of source parameters, performed with Bayesian inference [87,88], both require gravitational waveform templates. Waveform templates are needed to compute the Signal-to-Noise Ratio (SNR) and the significance of GW triggers, as well as the posterior probability distribution of the signal parameters. Figure 3 shows an example of a BBH signal in whitened detector data.
Accurate solutions of the Einstein equations for the two body problem can be obtained with Numerical Relativity (NR) simulations. However, the high computational cost of producing NR waveforms prevents the production of waveform catalogs spanning the full CBC parameter space [89,90,91,92,93,94,95,96,97]. The main parameters that describe a quasi-circular binary system of rotating BHs are the mass ratio and the spin vectors of the two objects. Neutron stars have additional internal degrees of freedom that are described by their tidal deformability, and generic compact binaries can also inspiral on eccentric orbits, which further increases the dimensionality of the binary parameter space. Since the binary parameters of GW events are a priori unknown, their matched filter analysis with GW detector networks requires models of the emitted GWs that smoothly vary as a function of binary parameters, rather than solutions at isolated points as provided by NR simulations. It is important to emphasize that waveform models must be both accurate and computationally efficient to be used in detection and parameter estimation pipelines. Good accuracy in terms of the overlap, the normalized cross-correlation maximized over time and phase shifts, between a waveform model and the most accurate waveforms available (usually NR simulations, or NR simulations stitched together with inspiral waveforms) is crucial to avoid missing events in searches or mis-measuring binary parameters. On the other hand, waveform models need to be fast to evaluate since searches and parameter estimation require tens to hundreds of millions of waveform evaluations, and the cost rises for lower mass binaries with hundreds of GW cycles in band.
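A simplified overlap computation (flat noise spectrum, circular time shifts, and phase maximization via the magnitude of the complex correlation; production codes weight the inner product by the detector noise PSD) can be sketched as:

```python
import numpy as np

def overlap(h1, h2):
    """Normalized cross-correlation of two waveforms, maximized over relative
    time and phase shifts; a flat noise spectrum is assumed for simplicity."""
    H1, H2 = np.fft.fft(h1), np.fft.fft(h2)
    # The inverse FFT of H1 * conj(H2) gives the complex correlation at every
    # circular time lag; taking its magnitude maximizes over a phase shift.
    corr = np.fft.ifft(H1 * np.conj(H2))
    norm = np.sqrt(np.sum(np.abs(H1) ** 2) * np.sum(np.abs(H2) ** 2)) / len(h1)
    return np.max(np.abs(corr)) / norm

# A toy chirp-like waveform with a Gaussian envelope.
t = np.linspace(0.0, 1.0, 4096, endpoint=False)
chirp = np.sin(2 * np.pi * (30.0 * t + 40.0 * t ** 2)) * np.exp(-((t - 0.5) / 0.2) ** 2)

print(overlap(chirp, chirp))                # identical waveforms: overlap = 1
print(overlap(chirp, np.roll(chirp, 200)))  # a pure time shift is maximized away
```

An overlap close to 1 between a model and an NR waveform indicates that searches and parameter estimation built on the model will lose little SNR relative to the true signal.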
The coalescence of a binary system starts with an inspiral phase corresponding to a large separation of the two objects where the GWs emitted can be computed with the post-Newtonian (pN) formalism [98]. As the objects come closer, the orbital velocity becomes comparable with the speed of light and the pN expansion breaks down. In this strong gravity regime NR simulations are needed to compute the emitted gravitational radiation during the merging of the objects. If the remnant is a BH the final object relaxes to a Kerr BH under the emission of a sum of damped quasi-normal modes [99] in the so-called "ringdown" phase.
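Two standard formulas summarize the limiting regimes described above: the leading (Newtonian) order of the pN frequency evolution during the inspiral, governed by the chirp mass, and the damped-sinusoid form of a dominant quasi-normal mode in the ringdown:

```latex
% Newtonian-order inspiral: the GW frequency f grows at a rate set by the
% chirp mass \mathcal{M} = (m_1 m_2)^{3/5} / (m_1 + m_2)^{1/5}:
\dot{f} = \frac{96}{5}\, \pi^{8/3} \left( \frac{G \mathcal{M}}{c^3} \right)^{5/3} f^{11/3}

% Ringdown: the remnant relaxes to a Kerr BH by emitting damped quasi-normal
% modes [99], the dominant one having the form
h(t) \propto e^{-t/\tau} \cos\!\left( 2\pi f_{\mathrm{qnm}} t + \phi_0 \right)
```

The breakdown of the pN expansion occurs between these two regimes, which is precisely where NR simulations are required.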
LIGO and Virgo rely on approximate solutions that are traditionally obtained through the effective-one-body (EOB) or phenomenological modeling approaches. In the EOB formalism [100,101] the PN inspiral information is re-summed and calibrated to NR data, and the merger-ringdown part is obtained from a phenomenological fit to NR data. EOB models [100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115] first solve the orbital dynamics by solving a complex system of ordinary differential equations and then obtain the waveform as a second step. Phenomenological models [116,117,118,119,120,121,122,123,124,125,126] provide full gravitational waveforms by tuning an extended PN inspiral expansion and separate analytical functions for the merger and ringdown waveforms to NR simulations that have been combined with PN or EOB inspiral waveforms, and then smoothly combining these three regions. Surrogate or reduced order models of NR or EOB waveforms [127,128,129,130,111,131,132,133,134,135,136,137] can significantly accelerate these waveforms while keeping high accuracy. They have become indispensable tools for GW data analysis over the past several years. Surrogate models fit interpolated decomposed waveform data pieces over the binary parameter space and also compress the waveform in time or frequency.
The above modeling approaches rely mostly on traditional polynomial interpolation or fitting techniques for model construction. In the past few years, more advanced ML techniques have started to be explored for model building. Gaussian process regression (GPR) can be considered as a generalization of a multivariate normal distribution and makes it possible not only to build a model of the input data, but also to obtain a measure of the uncertainty in the data. GPR enables us to infer the waveform at points of the parameter space not covered by NR simulations by assuming a joint Gaussian distribution between the known and predicted values, then computing the posterior probability of the predicted parameters. GPR fits have been used to build surrogate models of non-precessing BBH systems with iterative refinement [134], as well as surrogates of precessing binary NR waveforms [136,138]. The usefulness of including the waveform uncertainty in models is the capability to marginalize over it in Bayesian estimations of the source parameters [139,140].
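A one-dimensional toy version of GPR waveform interpolation (an RBF kernel, a smooth stand-in function in place of NR data, and the mass ratio as the only parameter) shows how the posterior mean interpolates the training points while the posterior variance quantifies the uncertainty away from them:

```python
import numpy as np

def rbf(x1, x2, scale=1.0):
    """Squared-exponential (RBF) kernel between two sets of 1D points."""
    return np.exp(-0.5 * (x1[:, None] - x2[None, :]) ** 2 / scale ** 2)

# Hypothetical training data: a waveform quantity (say, a phase coefficient)
# known from NR simulations at a handful of mass ratios q.
q_train = np.array([1.0, 1.5, 2.0, 3.0, 4.0])
y_train = np.sin(q_train)          # smooth stand-in for the NR-derived quantity

# GP posterior at new mass ratios: mean prediction plus an uncertainty that
# vanishes at the training points and grows in between.
q_test = np.linspace(1.0, 4.0, 50)
K = rbf(q_train, q_train) + 1e-8 * np.eye(q_train.size)   # jitter for stability
Ks = rbf(q_test, q_train)
alpha = np.linalg.solve(K, y_train)
mean = Ks @ alpha
var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)

print(np.max(np.abs(mean - np.sin(q_test))))   # small interpolation error
```

In waveform modeling the same machinery is applied to decomposed waveform data pieces over a multi-dimensional parameter space, and the posterior variance is what allows marginalizing over model uncertainty in Bayesian inference.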
The increasing use of ML in GW source modeling has motivated comparative studies of the regression methods used for fitting or interpolating waveform data, covering traditional polynomial methods, GPR, and, for the first time, artificial neural networks (ANNs) [141]. In [142], the authors carefully investigated the accuracy, training cost, and execution time of ML methods (GPR and ANNs) against linear interpolation, radial basis functions, tensor-product interpolation, polynomial fits, and greedy multivariate polynomial fits. This study addressed the question of whether more sophisticated and complicated methods are necessary to build gravitational waveform models for aligned-spin and precessing binaries. The authors concluded that sophisticated regression methods, especially ML, are not strictly needed in standard GW modeling applications, although ML techniques might be better suited to problems of higher complexity, such as fully spinning black holes.
A key input in the design of waveform models is the mass and spin of the remnant BH, entirely predicted from the initial BH parameters by general relativity and computed in numerical simulations. The parameters of the remnants from non-precessing binary systems are traditionally determined with explicit fits to NR results [143,144], a procedure that has been extended to determine the final spin magnitude for precessing systems [145]. The description of fully spinning BHs, however, requires a higher-dimensional parameter space, where ML methods are particularly suited to capture the complex relationship between the initial and final parameters. For this reason, the determination of the remnant BH mass, spin, and recoil velocity for precessing systems has been performed independently with GPR and Deep Neural Networks (DNN), with both methods showing increased accuracy compared to existing fits [146,147].
GWs from the remnants of BNS mergers can be used to place constraints on the neutron star equation of state. However, the NR simulations that are currently used to model the merger and post-merger stage of such GW signals are computationally expensive to generate and of limited accuracy, thereby restricting the ability to use them to perform parameter estimation of candidate GW detections. A hierarchical ML algorithm trained on NR simulations was developed in Ref. [148] that can quickly generate gravitational waveform amplitude spectra with mean overlaps of 0.95, and can also be used to place constraints on the quadrupolar tidal deformability given sufficient SNR in the post-merger signal.

Gravitational-wave signal searches
Searches for GW signals in ground-based detector data are typically split into four different types, depending on their search strategy. The first type targets CBCs: BBH, BNS, and NSBH systems are the most common sources and the only type detected to date. The second type, which we refer to as GW "bursts", includes transient sources with an unknown or partially modeled signal morphology, such as those produced by CCSNe. The third source type includes long-duration, continuous GWs that may be produced by an individual rotating neutron star. The fourth type is a stochastic background of GWs, which could consist of remnant GWs from the Big Bang or distant unresolved CBCs. In this section, we review ML approaches that have been developed to enhance LIGO and Virgo searches for these different types of GW signals.

CBC searches
Matched filtering [18] is a common approach to search for CBC sources in LIGO and Virgo data [6]. Matched filtering takes as input the raw calibrated strain data, together with a template bank of waveforms spanning a large astrophysical parameter space. The search then produces a list of GW triggers by cross-correlating the GW data with the waveforms in the template bank, weighted by the inverse of the interferometer's noise spectral density. Signal consistency tests are applied to triggers above a certain SNR threshold in order to determine the time-frequency distribution of power in a trigger. This distribution is compared to what would nominally be expected from the power in the matched filtering waveform. The comparison is done by splitting the template up into a number of frequency bins, such that each bin contributes an equal amount of power to the total matched filter SNR. An additional statistic is then constructed to compare the expected to the measured power in each bin [149].
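A minimal matched filter in white noise (a single toy template, unit noise spectrum, and no chi-squared consistency test) can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(4)
fs = 1024
t = np.arange(0, 4, 1 / fs)
N = t.size

# Toy "template": a short chirp-like waveform; real banks span masses and spins.
tmpl = np.where((t > 0) & (t < 0.5),
                np.sin(2 * np.pi * (50 * t + 60 * t ** 2)), 0.0)
tmpl /= np.sqrt(np.sum(tmpl ** 2))   # unit-norm template (white noise assumed)

# White Gaussian "detector" noise with the template injected at t = 2.0 s.
data = rng.normal(0.0, 1.0, N)
shift = int(2.0 * fs)
data += 10.0 * np.roll(tmpl, shift)

# Matched filter via FFT: correlate the data against the template at every lag.
snr = np.fft.irfft(np.fft.rfft(data) * np.conj(np.fft.rfft(tmpl)), N)
peak = np.argmax(np.abs(snr))
print(peak / fs, np.abs(snr[peak]))  # peak at the injection time
```

Production pipelines additionally whiten the data by the measured noise spectrum, maximize over a template bank, and apply the chi-squared frequency-bin test described above before ranking triggers.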
Recently, there have been developments in exploring how random forest ML algorithms may be used as an alternative to standard CBC signal detection techniques. Using a ranking statistic derived from a random forest of bagged decision trees, sensitivity improvements on the order of 70 ± 13% to 109 ± 12% compared to matched filtering were reported [150]. This study was carried out as a search for intermediate-mass BHs and stellar-mass BBHs (25 M⊙). In Ref. [151], a random forest classifier was instead trained and tested on simulated single-detector NSBH events, using hand-crafted features derived from a bank of inspiral templates as input. The classifier was able to detect 1.5 to 2 times as many signals at low false-positive rates as the standard "reweighted SNR" statistic, and does not require the chi-squared test to be computed. The results from both of these studies show that random forest ML approaches have the potential to produce higher detection efficiencies for CBC GW sources.
Other promising approaches to CBC signal identification have been proposed by several groups by applying deep learning-based methods [33,34,35,36]. CNN algorithms have been applied in multiple studies towards the search for GWs from CBC signals. CNNs were applied to simulated BBH signals and real LIGO events in non-Gaussian noise, as well as non-Gaussian noise alone in an attempt to classify such signals [152]. This analysis reported that CNNs perform as well as the standard matched filter method in extracting BBH signals under non-ideal conditions. CNNs and matched filtering were also compared in Gaussian noise, where matched filtering is thought to be ideal [35] and were able to match the efficiency of a matched filter analysis. These studies address the fundamental question of the feasibility of deep learning application to CBC GW searches [35,36].
A different conclusion is reached in Ref. [153], where CNN methods for CBC searches are shown not to provide an accurate measure of the detection statistical significance. The latter can be achieved in matched filter searches by sliding the data of one detector in time by an amount which is larger than the typical GW travel time between detectors. A measure of coincident events after the time slides is used to estimate the search background and the significance of detection candidates. CNN algorithms should implement an accurate statistical measure of the background to take current, ML-based CBC searches from proof-of-principle studies to production search codes.
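The time-slide background estimate can be sketched with synthetic trigger lists: shifting one detector's triggers by offsets far larger than the light travel time destroys any astrophysical coincidence, so the coincidences that survive measure the accidental rate (all trigger times below are synthetic):

```python
import numpy as np

def coincidences(t1, t2, window=0.01):
    """Count triggers in detector 1 with a partner in detector 2 within `window` s."""
    return sum(np.any(np.abs(t2 - t) < window) for t in t1)

rng = np.random.default_rng(5)
T = 10000.0
trig1 = np.sort(rng.uniform(0, T, 200))   # background trigger times, detector 1
trig2 = np.sort(rng.uniform(0, T, 200))   # background trigger times, detector 2

# Background estimate: shift detector 2's triggers by offsets much larger than
# the inter-detector light travel time (tens of milliseconds).
shifts = np.arange(1, 101) * 7.0
background = [coincidences(trig1, (trig2 + s) % T) for s in shifts]

zero_lag = coincidences(trig1, trig2)
mean_bg = np.mean(background)
print(zero_lag, mean_bg)
```

The significance of a zero-lag candidate is then quantified by how often the time-slid background produces an equally loud (here, equally numerous) coincidence, which is the statistical measure a production-ready ML search would also need to provide.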

Burst searches
A GW burst is a short duration signal with an unknown or partially modeled waveform morphology due to complicated or unknown astrophysics. Potential sources of GW bursts are CCSNe [154], pulsar glitches [155], NSs collapsing into BHs [156], cosmic string cusps [157], and many others. The uncertainties in signal morphology make it difficult to produce training sets for GW burst signals.
One of the standard LIGO-Virgo GW burst search methods is the coherent Wave Burst (cWB) pipeline [20,21]. The cWB algorithm uses a Wilson-Daubechies-Meyer wavelet transform to measure the excess power in the time-frequency domain that occurs coherently between multiple detectors. A maximum likelihood approach is then applied to determine the probability of a signal being present in the data and produce a list of GW detection candidate triggers. This model-independent method is well-suited to searching for signals with unknown morphology. The sensitivities of burst searches are generally more affected by short-duration glitches that may occur coincidentally in time between multiple detectors. No detections other than CBC sources have been reported by cWB unmodeled searches to date [158].
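The coincident excess-power idea behind cWB (here drastically simplified: broadband power in fixed time segments of two synthetic data streams, rather than a wavelet-domain coherent likelihood across a detector network) can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(6)
fs = 512
t = np.arange(0, 8, 1 / fs)

# An unmodeled "burst": a sine-Gaussian appearing coherently in two detectors.
burst = np.exp(-((t - 5.06) / 0.05) ** 2) * np.sin(2 * np.pi * 100 * t)
d1 = rng.normal(0, 1, t.size) + 6 * burst
d2 = rng.normal(0, 1, t.size) + 6 * burst

# Excess power in 0.125 s segments: mean squared amplitude, normalized so that
# pure noise gives ~1; an excess appearing in BOTH detectors is a trigger.
seg = fs // 8
p1 = np.sum(d1.reshape(-1, seg) ** 2, axis=1) / seg
p2 = np.sum(d2.reshape(-1, seg) ** 2, axis=1) / seg
threshold = 1.5
triggers = np.nonzero((p1 > threshold) & (p2 > threshold))[0]
print(triggers * seg / fs)   # segment start times flagged, near the burst time
```

Requiring the excess in both streams suppresses single-detector glitches, which is the same principle that makes coincident glitches the dominant background of burst searches.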
Currently there are no published ML searches for generic GW bursts. However, there have been ML searches for specific burst source types. In Ref. [159], the authors employ a neural network algorithm to reduce the impact of glitches on the cWB burst search and increase the significance of the CBC signals which are detected by the pipeline.
In recent years, multi-dimensional simulations of CCSNe have produced a selection of GW signal predictions from CCSN explosions [160,161,162,163]. However, some of these simulations include approximations of the required input physics that may result in artificial changes in the GW emission. Moreover, many simulations are terminated before the time of peak GW emission due to a lack of computational resources. Despite these issues, common features in the time-frequency GW emission have recently been identified across various CCSN simulation codes. This has allowed researchers to produce approximate models for a wider range of the CCSN parameter space than can be explored with full multi-dimensional CCSN simulations [164]. These approximate models allow us to explore supervised ML techniques for CCSN searches.
In Ref. [165], the authors apply a CNN to searches for GW bursts from CCSNe. Training and testing are performed with a parametrized phenomenological waveform model which is designed to match the most common features predicted by CCSN simulations. The method uses 100 different parameterizations of the phenomenological model. cWB pre-processing data is used to prepare images which are fed into the CNN. Red-green-blue (RGB) images are produced to determine the number of detectors where a signal is present: Red (R) for LIGO Hanford, green (G) for LIGO Livingston and blue (B) for Virgo. The algorithm is shown to improve the efficiency of cWB in its standard configuration.
A technique to reduce the background of searches for galactic CCSNe in single-interferometer configurations was developed in Ref. [32]. The method is based on a supervised evolutionary algorithm, Genetic Programming (GP). The procedure assumes that the event time and the distance of the CCSN are known from neutrino and optical observations. The GP algorithm is first trained on off-source data to produce a multivariate expression of the trigger features which is used as a cut to lower the search background. The multivariate expression can then be applied to on-source windows around a GW event candidate to increase the detection confidence. The effectiveness of the method was tested by injecting the set of waveforms used in the latest LIGO-Virgo CCSN search into 1.47 days of O1 data. The features are extracted from the standard cWB pipeline. The GP algorithm is then used to classify the triggers and remove the noise triggers. The outcome of the procedure is an increased statistical significance of GW candidate triggers that leads to a reduction of the SNR needed for a detection at 3σ confidence level by a factor ∼ 3.
In Ref. [166], the authors train a CNN on waveforms obtained by 3D simulations of neutrino-driven CCSNe, phenomenological sine-Gaussian waveforms, and scattered light glitches. A wavelet detection filter (WDF) [52] with time-domain whitening [167] is used to extract GPS triggers from the simulated Virgo and Einstein Telescope noise backgrounds. Whitened time series and spectrograms are used as input of 1D and 2D CNN algorithms to classify signal and noise classes. The method is then tested on CCSN models removed from the training sets. In a multi-label classification scheme, the CNN is capable of distinguishing among the individual CCSN and glitch classes at different SNRs. In Ref. [168], the authors train a CNN algorithm on the time series of CCSN waveforms and classify signals with two different explosion mechanisms.

Continuous wave searches
Narrow-band Continuous GWs (CWs) from spinning deformed neutron stars [169] have not yet been observed in data recorded by LIGO and Virgo. Although the expected waveform signatures for this type of signal are well known, their small amplitude makes them extremely hard to detect, and any search needs to process long stretches of data. As the number of parameter space points to search grows with the observation time, the sensitivity of current searches is limited by the available computing power. Several established search methods exist with different trade-offs between sensitivity, robustness and computational efficiency. For searches covering wide parameter spaces, analysis pipelines usually follow a hierarchical approach [170] with multiple stages. The first stage evaluates a detection statistic over a dense grid of candidate waveforms. Additional stages often include clustering of candidates to reduce computational waste, vetoing of known types of instrumental artifacts, and follow-up with increased resolution and/or observation time. See Ref. [171] for a recent review.
The huge computational cost of conventional CW searches makes ML approaches a promising alternative since, once trained, they are fast at searching. In addition, many current follow-up methods involve manual tuning steps, for which extensive simulation campaigns must be repeated for each new data set and search setup [e.g. 172,173]. Robust and flexible ML solutions would reduce this additional human and computational effort. At this point, several teams have started to explore two avenues for using ML methods in CW searches: (i) using ML algorithms as drop-in replacements for parts of a CW analysis pipeline that is still based on traditional grid search methods for the initial search stage; (ii) full ML analysis of the initial strain data.
Among the first category, Ref. [174] presents an application of deep learning to the classification of CW signal candidates. A conventional initial search, based on the F-statistic matched-filter pipeline, the TD-Fstat search [175] (see the documentation in [176]), generates a large number of multi-dimensional candidate signal distributions that have to be further analyzed. Then 1D and 2D versions of a CNN classifier are implemented, trained, and tested on a broad range of signal frequencies, corresponding to the reference frequency of the narrow-banded data. The training set contains Gaussian noise, simulated CWs from spinning triaxial-ellipsoid neutron stars, and stationary lines mimicking detector artifacts [177]. The authors of [174] demonstrate that these CNNs correctly classify the instances of data at various SNRs and frequencies, while also showing concept generalization, i.e., satisfactory performance at frequencies the networks were not trained on.
For another F-statistic search pipeline, the hierarchical semi-coherent analysis [178,179,180] running on the distributed computing Einstein@Home project [181], Ref. [182] employs a deep learning replacement for traditional clustering methods [183,184,185]. Clustering has the purpose of reducing the number of candidates from the initial search stage, enabling deep follow-up [184,173] at acceptable computational cost, thus improving the sensitivity of the overall pipeline. In [182], they train a region-based CNN [R-CNN, 186, 187] on real output of an Einstein@Home search [188] on Advanced LIGO O1 data. They demonstrate a high detection efficiency at low false alarm rate for sufficiently strong signals, and investigate the scaling of this performance with signal strength. They identify the R-CNN's brittle response to instrumental disturbances in the data as both a problem and an opportunity; since in the current implementation it already distinguishes these from normal noise, they expect that a more complete classification into three classes (normal noise, disturbances, and signals) can be pursued.
In a more radical approach than the works above, the authors of Ref. [189] apply a deep neural network to search for CWs from unknown spinning neutron stars on the raw time-series data. A CNN is trained on Gaussian noise with signal injections and compared to a matched filter search [190]. The analysis shows that the method is competitive with matched filtering, at least under these idealized noise conditions and when using data spans of limited duration. Thus, they provide the first demonstration of a full-ML search for CWs without prior input from a traditional first-stage search. However, the authors themselves consider this as only a 'first proof-of-principle' and point out several steps required for developing it into a mature analysis pipeline, including the use of data from multiple GW detectors and dealing with non-Gaussian real detector data, including the pervasive narrow spectral artifacts [177] that have long been a challenge to traditional CW search methods [191,192].
Besides CNNs and other neural networks, several other non-traditional approaches to CW searches have recently been explored. One successful application is the hidden Markov model tracking CW search method [193,194,195,196], where the true intrinsic emission frequency of a GW source is treated as a hidden variable tracked through the signal frequencies in the detector data, thus allowing for deviations from a simple signal model or even for intrinsic frequency fluctuations (e.g., spin wandering in binary sources and pulsar timing noise). The most likely time-frequency tracks are found with the Viterbi algorithm [197]. This method differs from most algorithms usually considered as ML, for example in that it does not require a separate training stage, but it is nevertheless inspired by work in the computer and information science fields and provides another example of a fruitful import into GW science. An independent Viterbi-based search has also been developed in Ref. [198]. Methods originally developed for CW searches have also proved to be useful for long-duration transient GWs [199,200]. After the detection of GW170817, several CW search methods were adapted to search for transient GW signals from long-lived BNS merger remnants [199,201,196,202]. In this scenario, as for the shorter-duration post-merger signals briefly discussed above in Sec. 3, the waveforms are not well known, and hence these traditional template-based search methods are limited in their robustness to deviations from the assumed models. The authors of Ref. [203] have developed a CNN analysis for BNS post-merger signals lasting O(hours-days). They characterized the CNN approach, answering questions such as: (i) how to train the networks, (ii) how many samples to train on, (iii) how robust they are to signals on which they were not trained, and (iv) the effect of different architectures on detection efficiency and false alarm probability.
They also applied their CNN in a real search on the week of LIGO data after GW170817, producing upper limits on possible GW emission similar to those in [199].
Other research areas where conventional CW methods reach their limits in terms of (i) robustness of the underlying signal model, (ii) computational efficiency, (iii) adaptability to new situations with reasonable human time investment, include very young sources with rapid frequency evolution (similar to the BNS post-merger case), glitching pulsars [200], and GW emission from neutron stars in binaries [e.g. 204,205,206]. Deep learning and other ML methods, or simple yet innovative methods imported from the computational science community, like the Viterbi algorithm [197,193,194,195,196,198], can lead to big steps towards first detections of these elusive GW signal types.
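As an illustration of the Viterbi tracking idea discussed above, the following toy sketch recovers a wandering tone in a simulated spectrogram; the spectrogram, injected track, and transition constraint are all invented for the example:

```python
import numpy as np

def viterbi_track(spectrogram, max_jump=1):
    """Find the time-frequency path maximizing summed power, allowing the
    frequency bin to move by at most max_jump per time step."""
    n_t, n_f = spectrogram.shape
    score = spectrogram[0].copy()           # best path score ending at each bin
    back = np.zeros((n_t, n_f), dtype=int)  # backpointers for path recovery
    for t in range(1, n_t):
        new_score = np.empty(n_f)
        for f in range(n_f):
            lo, hi = max(0, f - max_jump), min(n_f, f + max_jump + 1)
            prev = int(np.argmax(score[lo:hi])) + lo
            new_score[f] = score[prev] + spectrogram[t, f]
            back[t, f] = prev
        score = new_score
    path = np.zeros(n_t, dtype=int)         # backtrack from the best final bin
    path[-1] = int(np.argmax(score))
    for t in range(n_t - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path

# toy example: a slowly wandering tone buried in uniform noise
rng = np.random.default_rng(0)
spec = rng.random((50, 32))
true_track = (16 + np.cumsum(rng.integers(-1, 2, 50))).clip(0, 31)
spec[np.arange(50), true_track] += 5.0      # inject a strong drifting line
recovered = viterbi_track(spec)
```

Because the injected line wanders by at most one bin per step, the dynamic-programming recursion recovers it even though no single time slice would identify it reliably on its own.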

Stochastic background
A stochastic GW background may consist of many CBC sources that are too distant to be individually resolved or remnant GWs from the Big Bang [207]. Typical searches for a Gaussian stochastic background apply the cross-correlation method described in Ref. [22]. For non-Gaussian stochastic backgrounds from stellar mass BBHs, a Bayesian nested sampling method can be implemented for optimal search sensitivity [208]. In Sect. 5 we describe how ML can be used to improve the speed and accuracy of this method. A Gaussian mixture model is used to predict the discovery time of the GW stochastic background in Ref. [208].

Parameter estimation
To understand the astrophysics behind sources of GWs, it is essential to accurately measure the parameters of the source. For CBC signals, this is currently achieved using the LALInference [87], Bilby [88], or RIFT [209] tools designed for Bayesian parameter estimation and model selection. A Bayesian framework allows us to calculate posterior probability density functions for the parameters of GW signals. It also allows us to calculate the evidence for different models, which can be used for model selection. Computing the Bayesian evidence is computationally costly. In the case of CBC signals, this is due to the high number of signal parameters (∼ 15), the process of generating waveforms, and the SNR and length of the GW data being analyzed. In LALInference, the computational issues are addressed using either nested sampling [210] or Markov Chain Monte Carlo (MCMC) techniques [211].
To estimate a posterior distribution, MCMC techniques work by stochastically wandering through a parameter space, distributing samples in proportion to the density of the posterior. The LALInference implementation of MCMC uses the Metropolis-Hastings algorithm, which requires a proposal density function to generate a new sample that can depend only on the current sample. The efficiency of the sampler depends on the choice of the proposal density function.
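A minimal sketch of the Metropolis-Hastings idea, using a 1D toy target and a symmetric random-walk proposal (this is an illustration, not the LALInference implementation):

```python
import numpy as np

def log_posterior(x, mu=2.0, sigma=0.5):
    # toy 1D target: Gaussian posterior, known up to a constant
    return -0.5 * ((x - mu) / sigma) ** 2

def metropolis_hastings(n_samples, step=0.5, seed=0):
    rng = np.random.default_rng(seed)
    x = 0.0
    samples = []
    for _ in range(n_samples):
        x_new = x + step * rng.standard_normal()  # proposal depends only on the current sample
        log_ratio = log_posterior(x_new) - log_posterior(x)
        if np.log(rng.random()) < log_ratio:      # accept with probability min(1, ratio)
            x = x_new
        samples.append(x)
    return np.array(samples)

chain = metropolis_hastings(20000)[5000:]  # discard burn-in
# the chain's mean and std should approach the target's mu = 2.0, sigma = 0.5
```

The `step` size plays the role of the proposal density choice mentioned above: too small and the chain explores slowly, too large and most proposals are rejected.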
Nested sampling is used to calculate the Bayesian evidence and can also produce posterior distributions for the signal parameters. It transforms the multi-dimensional evidence integral into a one-dimensional integral over the prior volume. First, a set of live points is distributed over the entire prior. The point with the lowest likelihood is then removed and replaced with a new point of higher likelihood; this process continues until some stopping condition is reached. Posterior samples can then be produced by re-sampling the chain of removed points and the current live points according to their posterior probabilities. This method can take days to weeks to measure the parameters of a GW signal. Speeding up this process becomes more important as GW detectors become more sensitive and more signals need to be analysed.
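The nested-sampling loop described above can be sketched for a toy 1D problem; the Gaussian likelihood, uniform prior, and rejection-sampling replacement step are deliberate simplifications, not LALInference's implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

def log_likelihood(theta):
    # normalized Gaussian likelihood centred at 0.5 with width 0.05
    return -0.5 * ((theta - 0.5) / 0.05) ** 2 - np.log(0.05 * np.sqrt(2 * np.pi))

n_live, n_iter = 200, 1200
live = rng.random(n_live)                    # live points drawn from the uniform prior on [0, 1]
live_logL = log_likelihood(live)
log_Z_terms = []
for i in range(n_iter):
    worst = int(np.argmin(live_logL))        # lowest-likelihood live point
    # expected prior-volume shell weight: X_i - X_{i+1} with X_i = exp(-i / n_live)
    log_w = -i / n_live + np.log1p(-np.exp(-1.0 / n_live))
    log_Z_terms.append(log_w + live_logL[worst])
    # replace it with a prior draw constrained to higher likelihood (naive rejection)
    threshold = live_logL[worst]
    while True:
        candidate = rng.random()
        if log_likelihood(candidate) > threshold:
            break
    live[worst] = candidate
    live_logL[worst] = log_likelihood(candidate)

# add the contribution of the remaining live points, then sum the shells
log_Z_terms.extend(-n_iter / n_live - np.log(n_live) + live_logL)
log_Z = np.logaddexp.reduce(log_Z_terms)     # here Z = ∫ L dθ ≈ 1, so log_Z ≈ 0
```

Because the toy likelihood is normalized and the prior is uniform, the recovered log-evidence should scatter around zero with an uncertainty of roughly sqrt(H/n_live).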
Some efforts at speeding up GW parameter estimation with ML have already been implemented in LIGO and Virgo data analysis codes. RIFT is a generic algorithm that uses Gaussian process or random forest regression to approximate marginal likelihoods appearing in Bayesian inference for compact binary parameters. By efficiently parallelizing, RIFT can produce inferences at a much faster speed [212]. Multinest [213] is a generic algorithm that implements the nested sampling technique using ellipsoids. Bambi [37] incorporates the nested sampling in Multinest and combines it with a neural network to learn the likelihood function on the fly. It has been shown that Bambi can produce results comparable to the full nested sampling techniques at a much faster speed.
Other efforts to perform parameter estimation directly with neural networks have been carried out in [38,36,39,40]. In Refs. [39,40], it was first shown that a neural network could reproduce GW Bayesian posteriors, with the two studies carried out using independent methods. In Ref. [39], a conditional variational autoencoder (CVAE) is used to approximate the Bayesian posterior given a GW time series. The neural network used in this work is derived by starting from the requirement of minimizing the cross-entropy between the true Bayesian posterior and the approximate Bayesian posterior produced by the network. After some further derivation, one arrives at a network configuration which, when given a GW time series after training, will reproduce the true n-dimensional Bayesian posterior in less than a few milliseconds. Five BBH parameters are inferred (m_1, m_2, d_L, t_0, φ), where the phase φ has been internally marginalized out and all remaining parameters are fixed. Other work has been carried out in parallel in [40] using a multivariate Gaussian posterior model to produce 1- and 2-dimensional posteriors. This approach also utilizes reduced order modeling [214] to simplify the input GW time series to the network. Once trained on a generated set of waveforms with many different noise realizations, it is then easy to sample from the model with latency comparable to [39] (∼ 1-2 milliseconds).
In addition, Ref. [215] uses a different method, known as normalizing flows, to produce GW posteriors comparable to those in Refs. [38,40] in less than 2 seconds. A normalizing flow is a series of invertible transformations that can be used to transform a simple initial distribution (in this case, a multivariate Gaussian) into a more complex target distribution. Normalizing flows are usually realized as neural networks, meaning that the parameters of the transformations are learned. The authors compare three different approaches: a conditional variational autoencoder (similar to the network used in [38]), a masked autoregressive flow (MAF) [216], and a CVAE with autoregressive flows. The results from the single MAF and the CVAE networks are comparable, while the combined CVAE and MAF model produces the best result.
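The change-of-variables idea behind normalizing flows can be illustrated with the simplest possible flow, a 1D affine map trained by gradient descent on the exact likelihood; this toy is our own construction, not the architecture of Ref. [215]:

```python
import numpy as np

# Minimal 1D normalizing flow (illustrative only): an invertible affine map
# x = a*z + b pushes a standard Gaussian base distribution onto the data
# distribution; maximizing the change-of-variables likelihood recovers the
# data mean and standard deviation.
rng = np.random.default_rng(2)
data = 3.0 + 2.0 * rng.standard_normal(5000)   # samples from the target N(3, 2)

log_a, b = 0.0, 0.0                            # flow parameters, a = exp(log_a) > 0
lr = 0.05
for _ in range(2000):
    a = np.exp(log_a)
    z = (data - b) / a                         # inverse map: pull data back to base space
    # mean negative log-likelihood per sample: 0.5*z**2 + log_a + const
    grad_log_a = 1.0 - np.mean(z * z)          # d<NLL> / d log_a
    grad_b = -np.mean(z) / a                   # d<NLL> / d b
    log_a -= lr * grad_log_a
    b -= lr * grad_b

a = np.exp(log_a)                              # learned scale and shift: a ≈ 2, b ≈ 3
```

Deep flows such as the MAF stack many such invertible maps with neural-network parameters, but the training objective is this same change-of-variables likelihood.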
All of the current GW ML parameter estimation studies are still at the proof-of-principle stage, but they show that ML will be a promising tool for future GW parameter estimation. As the detectors' sensitivity improves, the number of detections will significantly increase, and one of the advantages of these ML techniques will be their speed in measuring the sources' astrophysical parameters relative to traditional methods, making it easier to process a large number of GW alerts.

Low-latency source properties inference
The source-property inference of CBC events is one of the three astrophysical data products that the LIGO and Virgo collaborations provide to the external community, the other two being 3D sky localization [217] and source classification [218]. The source-property inference, also known as EM-Bright, consists of two statistical metrics: (1) the probability that the CBC system contains a neutron star of mass less than 3.0 M⊙, P(HasNS), and (2) the probability that the final coalesced object is surrounded by tidally-disrupted matter after the merger, P(HasRemnant). Numerical solutions of the Einstein equations in the presence of matter provide us with estimates of the amount of tidally-disrupted matter. Several fitting formulae have been derived to compute this quantity in low latency [219,220].
In an ideal situation, Bayesian parameter estimation of GW data can provide the posterior probability distribution of the various source parameters. Cuts based on the maximum neutron star mass can then be applied to compute P(HasNS). Similarly, the fitting formula for the tidally-disrupted matter can be applied to the parameter posterior distributions to infer P(HasRemnant). However, the fastest LIGO-Virgo parameter estimation infrastructures currently available do not meet the low-latency requirements for electromagnetic (EM) follow-up at X-ray and optical wavelengths. Obtaining the most reliable EM-Bright values in the absence of Bayesian posterior samples is the most important challenge in source-property inference.
Matched filter searches give point estimates of masses and spins that can be used for low-latency estimates of EM-Bright properties. However, the individual mass and spin components from matched filter searches often have large errors with respect to the true parameter values. During the O2 run, LIGO and Virgo researchers estimated these errors by implementing a Fisher approximation method to construct an ellipsoidal region of the parameter space around the matched filter point estimates. However, this method is only effective for high-SNR signals and it ignores the biases of the matched filter search in its parameter measurements. A supervised ML algorithm was developed to address this issue in Ref. [221]. The algorithm uses the scikit-learn [222] KNeighborsClassifier method to classify a source as HasNS or HasRemnant. An input data set is created by injecting simulated signals into GW data and performing a search. A map between true values and matched filter search point estimates is then used to train the classifier. Source-property inferences of GW detection candidates obtained with this method were provided in low latency to the astronomy community during O3.
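A hedged sketch of this idea using scikit-learn's KNeighborsClassifier follows; the injection distribution, noise model, and mass threshold are invented for illustration and this is not the Ref. [221] code:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy version of the supervised EM-Bright classifier: labels come from the
# TRUE injected masses, features are noisy "matched-filter" point estimates,
# so the classifier learns the map from recovered to true parameters.
rng = np.random.default_rng(3)
n = 4000
true_m1 = rng.uniform(1.0, 10.0, n)            # true component masses (M_sun)
true_m2 = rng.uniform(1.0, 10.0, n)
has_ns = ((true_m1 < 3.0) | (true_m2 < 3.0)).astype(int)   # HasNS label

# emulate noisy matched-filter point estimates of the masses
rec_m1 = true_m1 * (1.0 + 0.1 * rng.standard_normal(n))
rec_m2 = true_m2 * (1.0 + 0.1 * rng.standard_normal(n))
X = np.column_stack([rec_m1, rec_m2])

clf = KNeighborsClassifier(n_neighbors=11)
clf.fit(X[:3000], has_ns[:3000])               # train on 3000 injections
accuracy = clf.score(X[3000:], has_ns[3000:])  # held-out classification accuracy
```

Misclassifications concentrate near the 3 M⊙ boundary, where the recovered masses are ambiguous, which mirrors the real difficulty of low-latency source classification.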

Rates and populations of gravitational-wave sources
LIGO and Virgo have already detected a small population of CBC sources [5]. The number of detections is expanding rapidly as the detectors become more sensitive, and within the next few years the population size will reach into the hundreds, allowing us to perform detailed CBC population studies. In the case of compact binaries, measuring properties such as their mass and spin distributions could allow us to determine their formation mechanism [223,224,225,226,227,228,229].
Several studies have demonstrated how ML can be applied for population analysis once a large enough population of CBC sources has been detected. A few of these studies apply unmodeled clustering techniques, such as Gaussian Mixture Models, to determine if GW detections come from multiple CBC populations [230,231,232]. They perform the clustering on the mass and spin measurements of the individual CBC detections. This method is well-suited as it does not require any prior knowledge of the expected CBC populations. This allows us to determine how many different populations of sources are present in detections made by LIGO and Virgo, and what fraction of detected events belong to each population. The masses and spin distributions of each population can then inform us of the differences in formation of the different source populations.
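The clustering step can be sketched with scikit-learn's GaussianMixture on two hypothetical populations; the masses and spins below are invented for illustration and are not LIGO-Virgo measurements:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Toy population clustering: two invented formation channels with different
# chirp-mass and effective-spin distributions; a Gaussian mixture with the
# lowest BIC recovers the preferred number of components.
rng = np.random.default_rng(4)
# columns: (chirp mass in M_sun, effective spin)
pop1 = np.column_stack([rng.normal(10.0, 1.5, 150), rng.normal(0.0, 0.05, 150)])
pop2 = np.column_stack([rng.normal(30.0, 4.0, 100), rng.normal(0.4, 0.10, 100)])
events = np.vstack([pop1, pop2])

# compare 1-, 2-, and 3-component mixtures via the Bayesian information criterion
bics = {k: GaussianMixture(n_components=k, random_state=0).fit(events).bic(events)
        for k in (1, 2, 3)}
best_k = min(bics, key=bics.get)    # lowest BIC: preferred number of populations
```

In this unmodeled approach no prior population model enters the fit; the component count, mixing fractions, and per-component means and covariances are all inferred from the catalog itself.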
One of the current standard methods for population analysis in GWs is Bayesian hierarchical modeling. This approach is effective when there are trusted known population models, and allows for mixing ratios of different populations. However, this method can be computationally expensive. In [233], the authors combine hierarchical Bayesian modeling with a flow-based deep generative network. Combining ML with Bayesian hierarchical modeling allows for a population analysis that is too computationally complex for hierarchical modeling alone.
These modeled and unmodeled ML techniques will be applied to the GW populations detected in the future GW detector observing runs.

Identification of electromagnetic counterparts to gravitational-wave sources
Detecting EM emission from a GW source can considerably increase the knowledge of the astrophysical source properties and contribute to the validation of cosmological models. The authors of [234] applied an Artificial Neural Network (ANN) to localize GW signals from BBH mergers. The input data to the ANN consist of the arrival time delays, amplitude ratios, and phase differences from multiple GW detectors. For the purpose of GW source localization, they divided the sky into a grid of sectors. The samples were labelled with the sector number to which they belonged, and the ANN was trained to classify them into their correct sector. Testing their method with simulated BBH signals in Gaussian Advanced LIGO noise, they found that for coarse angular resolution (18-128 sectors), the model is able to classify unseen GW samples into their correct sector in more than 90% of cases when a multi-labelling scheme is applied. For finer angular resolution (1024-8192 sectors), the exact classification accuracy drops. For these cases, the probability distribution over the sectors, obtained for each GW sample, was used as a ranking statistic to calculate the areas of 90% probability contours. This method is potentially orders of magnitude faster than traditional sky localization methods, taking around 18 ms to localize a single GW sample. Currently, work is being done to extend this method to real noise and to BNS and NSBH sources.
Once the sky map for a GW event is produced, it is sent out to the EM community. However, the optical and near-infrared transient contamination rate can be prohibitively large. ML techniques are a promising method for optimising, automating, and significantly reducing the number of optical and near-infrared transients that must be manually vetted. The expected EM emission of a BNS merger is a short-duration Gamma-Ray Burst (GRB) which is quickly followed by an optical afterglow and an optical and near-infrared kilonova [235]. While the prompt GRB shows highly collimated emission along the line of sight of a jet, the kilonova emission is isotropic. Even if the prompt emission of the GRB is off-axis with respect to the observer's line of sight, the afterglow and the kilonova can be observed at later times, albeit at much fainter magnitudes. The contamination rate of optical transients in kilonova searches is estimated to be 1.79 deg⁻² down to limiting magnitudes of m_i = 23.5 and m_z = 22.4 in the i- and z-bands, respectively [236]. Extrapolating out to aLIGO's design detector horizon of D_L = 200 Mpc, and correspondingly increasing the sensitivity and limiting magnitude required to observe kilonovae at these distances, this rate is expected to increase by a factor of 4 [236]. Types of astrophysical contaminants that may occur in the same time windows are Type Ia and Type II supernovae, flaring objects such as M-dwarf flares, and nuclear activity in active galactic nuclei (AGN). While these objects typically evolve on time scales different from those of GRB afterglows and kilonovae, they may appear in single subtracted images as possible candidates. Follow-up and vetting of these contaminants is a time-consuming process.
Besides the optical astrophysical contamination, a large number of optical image artifacts occur as a direct result of image processing. Difference imaging, or image subtraction, is the main technique used for the identification of novel transients. It works by subtracting an older reference image from a new science image, removing consistent steady-state brightness sources and ideally leaving behind only novel transients.
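A toy sketch of difference imaging follows; it ignores PSF matching and image registration, which real pipelines must handle, and all sources are synthetic:

```python
import numpy as np

# Toy difference imaging: steady sources cancel in the subtraction, leaving
# only the new transient (plus noise and, in real data, artifacts).
rng = np.random.default_rng(7)
ny, nx = 64, 64
y, x = np.mgrid[0:ny, 0:nx]

def star(cy, cx, flux, sigma=1.5):
    """Gaussian point source at pixel (cy, cx)."""
    return flux * np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))

steady = star(20, 20, 500.0) + star(45, 50, 300.0)           # persistent sources
reference = steady + rng.normal(0.0, 1.0, (ny, nx))          # old reference image
science = steady + star(32, 10, 200.0) + rng.normal(0.0, 1.0, (ny, nx))  # new transient

diff = science - reference                                   # steady sources cancel
peak = tuple(int(v) for v in np.unravel_index(np.argmax(diff), diff.shape))
```

In practice imperfect PSF matching and registration leave residuals at the positions of the steady sources, which is precisely the artifact population that the ML vetting methods below aim to reject.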
Given the high computational cost of image subtraction and the large number of artifacts it produces, many ML techniques have been employed to reduce image-processing contamination. While there are extensive studies in the literature on transient identification with ML, one method directly addresses the reduction of image artifacts in searches for the EM counterparts of GW signals over their error-region skymaps of hundreds to thousands of square degrees [237]. Because true transients should appear point-like in the subtracted image, the Point-Spread Function (PSF) of every object in the subtracted image can be compared directly against the point-like PSF of the convolved reference image. The PSF of the reference image is derived by decomposing each source in the image onto a set of Zernike polynomials and summing the median of the distribution of coefficients for each order. Every source in the subtracted image is decomposed in the same way, and the coefficient for each order is compared against the reference PSF, generating a value, called the Zernike distance, for how far apart the reference PSF and the subtracted-source PSF are in statistical space. Using shape information alone and employing unsupervised methods, the method returns an efficiency of 91.5% and 92.8% on sets of artificial source injection tests with archival images from the Dark Energy Camera [238] and the Palomar Transient Facility [239,240], respectively. The reduction in the number of optical image artifacts is over 99.97%, with over 91% of the simulated true signals being preserved. This is a reduction from hundreds to thousands of artifacts per image to single- or double-digit numbers of objects, making manual vetting of optical candidates feasible over error regions of hundreds of square degrees.
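A simplified sketch of a Zernike-style shape comparison follows; this is our illustration of the general idea, not the Ref. [237] implementation, and the basis truncation and image stamps are invented:

```python
import numpy as np

# Project small image stamps onto a few low-order Zernike polynomials over
# the unit disk and compare coefficient vectors: a point-like PSF and an
# elongated subtraction artifact separate cleanly in coefficient space.
n = 33
y, x = np.mgrid[-1:1:n * 1j, -1:1:n * 1j]
r, theta = np.hypot(x, y), np.arctan2(y, x)
disk = r <= 1.0

# low-order Zernike polynomials: piston, tilts, defocus, astigmatism
basis = np.array([np.ones_like(r),
                  2.0 * r * np.cos(theta),
                  2.0 * r * np.sin(theta),
                  np.sqrt(3.0) * (2.0 * r ** 2 - 1.0),
                  np.sqrt(6.0) * r ** 2 * np.cos(2.0 * theta)])
basis = (basis * disk).reshape(len(basis), -1)
basis /= np.linalg.norm(basis, axis=1, keepdims=True)   # normalize on the pixel grid

def zernike_coeffs(stamp):
    """Project an image stamp onto the truncated Zernike basis."""
    return basis @ (stamp * disk).ravel()

def zernike_distance(stamp, reference_coeffs):
    """Euclidean distance between coefficient vectors (the 'Zernike distance')."""
    return float(np.linalg.norm(zernike_coeffs(stamp) - reference_coeffs))

psf = np.exp(-(r / 0.3) ** 2)                        # point-like source
artifact = np.exp(-(x / 0.8) ** 2 - (y / 0.1) ** 2)  # elongated subtraction artifact
ref_coeffs = zernike_coeffs(psf)
```

Thresholding on this distance is what allows the shape-only, unsupervised rejection of non-point-like artifacts described above.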

Conclusions
In this paper, we have provided a review of ML applications to GW science. ML is an exciting area of development in the field of multi-messenger astrophysics. Over the years, LIGO and Virgo researchers have applied a variety of ML algorithms to many challenging problems in GW data analysis and detector characterization.
The data from LIGO and Virgo is non-stationary and non-Gaussian. ML methods can be successfully used to improve the quality of these data. The probability of a GW signal candidate being astrophysical or transient detector noise can be determined by applying ML to the data from thousands of environmental and instrumental monitors. The origin of detector noise can be inferred by applying classification techniques to find different types of transient noise. Citizen science projects provide training data for some of these studies.
Matched filter searches and Bayesian parameter estimation require a priori knowledge of the theoretical templates of the expected GW signals. Providing these signal predictions with full numerical relativity is computationally expensive. ML can be used to predict the GW waveforms in areas of the signal parameter space not covered by full numerical relativity. Searches based on ML techniques have comparable search sensitivity to matched filter searches. They are promising tools for future GW searches.
ML methods can also be successfully applied in GW searches where the exact signal morphology is unknown. Various ML algorithms have been developed to increase the search sensitivity of the standard LIGO-Virgo searches for GW bursts, as well as searches for GWs from CCSN and longer duration continuous GW signals.
After a GW signal has been detected, LIGO and Virgo run computationally expensive parameter estimation codes to determine the characteristics of the source. We have reviewed several studies showing that ML algorithms can significantly speed up parameter estimation of GW signals. After multiple detections are made, ML can be applied to determine the populations of GW sources and their properties, which will inform us of their formation mechanisms. Finally, ML can aid in finding EM counterparts to GW signals.
Considering the future improvements in the sensitivity of GW detectors, and their ability to detect many events per week, ML techniques are poised to become essential tools in GW science and multi-messenger astrophysics.