Search for black hole hyperbolic encounters with gravitational wave detectors

In recent years, the proposal that there is a large population of primordial black holes living in dense clusters has been gaining popularity. One natural consequence of these dense clusters will be that the black holes inside will gravitationally scatter off each other in hyperbolic encounters, emitting gravitational waves that can be observed by current detectors. In this paper we will derive how to compute the gravitational waves emitted by black holes in hyperbolic orbits, taking into account up to leading order spin effects. We will then study the signal these waves leave in the network of gravitational wave detectors currently on Earth. Using the properties of the signal, we will detail the data processing techniques that can be used to make it stand above the detector noise. Finally, we will look for these signals from hyperbolic encounters in the publicly available LIGO-Virgo data. For this purpose we will develop a two step trigger. The first step of the trigger will be based on looking for correlations between detectors in the time-frequency domain. The second step of the trigger will make use of a residual convolutional neural network, trained with the theoretical predictions for the signal, to look for hyperbolic encounters. With this trigger we find 8 hyperbolic encounter candidates in the 15.3 days of public data analyzed. Some of these candidates are promising, but the total number of candidates found is consistent with the number of false alarms expected from our trigger.


I. INTRODUCTION
Ever since the first detection of gravitational waves in 2015 by the LIGO-Virgo collaboration [1] a new era of gravitational wave astronomy has begun. With the much improved sensitivity of the Advanced LIGO [2] and Advanced Virgo [3] detectors, the observation of gravitational wave events has become a common occurrence, with more than 50 black hole mergers confirmed [4], as well as the detection of a binary neutron star inspiral [5] whose electromagnetic counterpart was also observed, opening the era of Multimessenger Astronomy [6].
Astrophysical models have problems explaining the high rate of black hole mergers detected as well as some of their measured parameters. Observed black holes have a spin much smaller than what is expected if they were formed from the collapse of stars [7] and some of them have masses impossible to generate in this way. Stellar collapse models predict black hole masses to be between 5 and 50 solar masses [8], but there is experimental evidence in favor of black holes with masses above [9] and below this range [10].
These observations point at a new population of black holes that do not have a stellar origin. As a consequence, there has recently been renewed interest in primordial black holes (PBHs) [11], which are black holes formed shortly after the Big Bang by the gravitational collapse of the primordial density fluctuations [12].
Even though there are some constraints on the primordial black hole population coming mainly from microlensing experiments as well as dwarf galaxy stellar dynamics and the cosmic microwave background [11], there is still enough freedom for a black hole population in the 0.1M to 100M mass range to comprise up to all the dark matter in the universe. This is the mass range which current gravitational wave detectors are sensitive to and could thus explain their observed anomalies.
A primordial black hole population in this 0.1−100M mass range also has a strong theoretical motivation [12]. A few fractions of a second after the Big bang, at a temperature of Λ QCD ∼ 200MeV, the QCD transition took place and quarks and gluons went from being free to being confined in hadrons. The sudden drop in the speed of sound due to the creation of non relativistic hadrons from relativistic quarks and gluons can lead to the collapse of high density regions, generating a large amount of primordial black holes. The natural value of the mass of the black holes formed in this way lies in the aforementioned 0.1 − 100M mass range [12]. Moreover, the spin of the black holes generated in this way is expected to be small, in accordance with the LIGO-Virgo observations.
In addition to the experimental evidence for this population coming from gravitational waves, most relevant for this paper, one should also note other hints that point to the existence of these primordial black holes [13] such as the microlensing of stars in M31 and distant quasars, the dynamics and star clusters of ultra-faint-dwarf-galaxies, the core galaxy profiles from primordial black hole scattering, the correlations between X-ray and infrared cosmic backgrounds and the Chandra Deep Field South.
The spatial distribution of the black holes will be highly dependent on the primordial power spectrum from inflation. In general, due to the nature of the primordial perturbations it will be natural for the black holes to form clusters [14]. This clustering will in addition relax the constraints on primordial black holes coming from microlensing [15], which usually assume a homogeneous distribution of black holes.
Simulations of these clusters [16] show that not only are black hole binaries formed, but there will also be a large amount of "hyperbolic encounters" in which two black holes scatter off each other with emission of gravitational wave bremsstrahlung. If in the hyperbolic encounter the two black holes pass sufficiently near each other, the system will emit a large amount of gravitational waves that will be possible to detect in current and future gravitational wave detectors [17][18][19]. Finding these hyperbolic events would be a strong evidence for clustered primordial black holes and could be used to study their mass and spin [20]. Moreover, dense stellar clusters may also accommodate collisions among neutron stars with impact parameters significantly larger than their physical sizes (to prevent tidal disruptions), as well as BH collisions, and could thus emit GW in the LIGO band. In order to see these events, those clusters should be sufficiently near us to be detectable, but there are plenty of these in the halo of our galaxy or nearby galaxies. The interpretation of the event GW190521 as a dynamical capture of two black holes [21,22] might already be a hint of dense clustering of black holes in our universe.
To our knowledge, no systematic search for close hyperbolic encounters has been carried out in the LIGO-Virgo data. The objective of this paper is to look for this type of events in the available open data [23].
To this end, in Sec. II we will develop 1.5 post newtonian (PN) accurate templates for the gravitational waves emitted by spinning compact binaries in hyperbolic orbits and we will numerically implement them in an efficient way using Python. In Sec. III we will study how these gravitational waves leave a signal in the network of gravitational wave detectors currently present on Earth. Since we are interested in studying the signal in real detectors, dominated by noise, we will see how to use signal processing techniques such as filtering, whitening and Q transforming to make the signal of these hyperbolic encounters stand out over the noise. Having determined the signal we are looking for, in Sec. IV we will develop a two level trigger to extract the possible hyperbolic candidates from the data. Similarly as in standard burst searches [24], the first level of the trigger will be a loose trigger based on correlations between detectors and tuned to select possible close hyperbolic encounter event candidates while doing a large reduction of the data. The second level of the trigger will consist of a convolutional neural network (CNN) [25,26] to classify the events of the first level trigger into either noise or close hyperbolic encounters (CHEs). This neural network will be trained using simulations of the close hyperbolic encounter signals as well as correlation triggers of noise.

II. DEVELOPMENT OF TEMPLATES FOR CLOSE HYPERBOLIC ENCOUNTERS
To compute the gravitational waves emitted by the scattering of two black holes in a close hyperbolic encounter, we will first need to determine the orbit that these black holes follow. Using this orbit we will then compute the gravitational wave strain emitted by the system. We will want to take into account up to the leading order effects of the spins of the black holes, since the spin is a critical quantity to distinguish between astrophysical and primordial black holes.
A. 1.5PN accurate hyperbolic orbit for spinning compact binaries The formulation of the problem we want to solve is very simple, we want to study the scattering of two gravitationally interacting masses m 1 and m 2 with spins S 1 and S 2 respectively. Nonetheless, this problem does not have a closed analytical solution in general relativity (GR), and numerically solving the full set of Einstein equations is in general not feasible, since the computational cost is prohibitive. Because of this, the problem will be studied using a perturbative expansion of general relativity in powers of 1 c 2 known as the post newtonian (PN) approximation.
The total mass of the binary is m = m 1 + m 2 , while the reduced mass is µ = m 1 m 2 /m and the symmetric mass ratio is η = µ/m. To find the dynamics of the system we will use the Hamiltonian formulation of general relativity. We are interested in studying the two body scattering problem taking into account up to leading order spin-orbit coupling, which is of order 1.5 PN. Because of this we will have to consider the following reduced hamiltonian (H/µ): H( r, p, S 1 , S 2 ) = H N ( r, p) + H 1PN ( r, p) where we have set r = R/(Gm) and p = P/µ, being R the relative separation vector between the black holes and P its conjugate momentum. S 1 and S 2 are the reduced spin vectors, that is, S 1 = S 1 /(µGm) and S 2 = S 2 /(µGm). In the language of Poisson brackets {·, ·}, these variables satisfy: with all other brackets between the variables being zero. The Hamiltonians appearing in Eq. (1) can be derived directly from Einstein's theory of general relativity. The result can be found in the literature [27] and has the following form: H 1PN ( r, p) = 1 c 2 wheren = r/r and the effective spin S eff is a quantity of order O(1/c) defined as a combination of the spins of both objects given by: If we label the black holes such that m 1 ≥ m 2 , then δ 1 and δ 2 are given by: To find the equations of motion we will use the Poisson brackets. Instead of finding an equation of motion for the momentum p, it will be more convenient to substitute this equation for an equation of motion for the angular momentum vector L = r × p. If we omit terms smaller than O(1/c 3 ), the equations of motion are: In addition to these equations of motion, we will have constants of motion that are conserved in time. The reduced energy E = H is conserved because ∂ t H = 0. Multiplying Eq. (6a) by L we immediately get that L = | L| is conserved. Doing the same multiplication by L to Eqs. (6b) and (6c), we get that S 1 = | S 1 | and S 2 = | S 2 | are also conserved. It can also be shown by direct com-putation that the magnitude L · S eff is conserved: Finally, we observe that if we add Eqs. (6a), (6b) and (6c) we get that: And thus the total reduced angular momentum J, defined as J = L + S 1 + S 2 , is conserved in both magnitude and direction. This allows us to introduce a set of Cartesian coordinates (x, y, z) which generate the triad (ê x ,ê y ,ê z ) such that at all times: We will also define a set of spherical coordinates (r, θ, φ) which generate the triad (n,ê θ ,ê φ ). Using these coordinates we can write: (10c)

Radial motion
Using Eqs. (10a) and (10c) we have that L 2 = | r× p| 2 = r 2 (p 2 θ + p 2 φ ) and thus: Using the Newtonian Hamiltonian of Eq. (3a) we have: We can substitute Eqs. (11) and (12) in the 1.5PN Hamiltonian of Eq. (1) and get a 1.5PN accurate expression for the energy: Projecting the equation of motion for r of Eq. (6d) in the radial direction, we obtain: And substituting Eq. (11) into this equation of motion we have: Finally, substituting Eqs. (11) and (12) in Eq. (15), then using the result in Eq. (13), solving forṙ 2 and omitting terms O(1/c 4 ) or smaller, we get: where A, B, C and D are defined only in terms of the constants of motion and are given by the following expressions: Since D is of order O(1/c 2 ), we can define a new variable r with the following equation: And then, omitting terms of order O(1/c 4 ), we can write Eq. (16) as:ṙ with: Eq. (19) is completely analogous to the expression one gets for Kepler's problem in Newtonian dynamics, and thus it accepts hyperbolic solutions given in the following parametric way: Substituting this in Eq. (19) and imposing the equality of both sides, one gets the following orbital parameters: We can now find a parametric expression for r undoing the transformation of Eq. (18): where the new parameters are given by: And thus we have been able to write the radial motion in a parametric manner only in terms of the constants of motion. Nonetheless, the constants of motion E and L will not be the most convenient to express our results. It will be better to express them in terms of orbital parameters, namely the eccentricity e t and the mean motion ξ and defining a new constant Σ to quantify the effect of the spin. That is: The remaining orbital parameters can be expressed in terms of these new variables in the following way: Substituting this in the expressions of Eq. (23) for the orbit, we finally get: whereṙ is computed using the fact that dr dt = dr dv / dt dv .

Angular motion
If we try to naively solve the equations of motion for the angular part of the two body system using Eq. (6d), we will have that: These equations are not in an optimal form since they explicitly depend on p θ and p φ and they are coupled to each other and to the equation of motion for S eff . Part of the difficulty lies in the fact that in our reference frame, the orbital angular momentum L = r × p precesses, and thus, the plane of the orbit is not constant. This can be solved going to a new non-inertial coordinate system in which L = Lk at all times. One of the unit vectors of the new coordinate system will then bek. Since r is perpendicular tok = r× p L , then the radial direction n = r/r can be used as another unit vector of the new basis, which will allow us to use the radial solution found in Sec. II A 1. With these two unit vectors chosen, the new triad is fully specified: The transformation between our previous triad (ê x ,ê y ,ê z ) and this new one can be written in terms of three Euler angles Φ, α and ι in the following manner [28]:n = (cos α cos Φ − cos ι sin α sin Φ)ê x + (sin α cos Φ + cos ι cos α sin Φ)ê y + (sin ι sin Φ)ê z , (30a) k = sin α sin ιê x − cos α sin ιê y + cos ιê z .
In these coordinates we have that the relative position is given by: where we can see that the radial part is left unchanged and thus the results of Sec. II A 1 for r andṙ are still valid. In addition, we can use the equation of motion Eq. (6d) for r to compute: where we have used that p · L = p · ( r × p) = 0. From Eq. (32) we can deduce thatα andι are both of order O(1/c 3 ). The equation of motion in theξ direction is: Using that p · ( L × r) = L · ( r × p) = L 2 andα is O(1/c 3 ), we can solve Eq. (33) forΦ: With all of these results we can compute the modulus squared of˙ r given in Eq. (31b), ignoring O(1/c 4 ) terms we have that: The modulus squared of˙ r can also be computed using the equations of motion (ignoring terms O(1/c 4 )): |˙ r| 2 = { r, H} · { r, H} = p 2 + 2 L · S eff c 2 r 3 where we have used that p · ( S eff × r) = S eff · ( r × p) = L · S eff . Substituting the expression for p 2 found in Eq. (15) into the first term of Eq. (36) we have that: And finally using expressions of Eqs. (11) and (12) to express p in the O(1/c 2 ) term of Eq. (37) in terms of constants of motion we get: Since we have now two expressions for |˙ r| 2 in Eqs. (35) and (38), we can equate them and solve forΦ: And taking the square root of this expression ignoring O(1/c 4 ) terms, we finally get: If we rewrite the constants of motion in terms of the parameters of the orbit defined in Eq. (25) we get: where we can now substitute our expression of Eq. (27a) for r obtained when solving the radial motion: An important parameter when talking about hyperbolic encounters is the impact parameter, which is defined as usual: Using the expressions of Eq. (31) for r and˙ r, the impact parameter takes the following 1.5 PN accurate form: Finally, substituting Eq. (41) forΦ and Eq. (27b) foṙ r, and taking the limit r → ∞ and consequently v → ∞ we have: In addition, to get an idea of the validity of the approximation, it will also be convenient to compute the maximum velocity, which is reached at periastron. Computing this to leading order we obtain: However, the problem is not complete, to be able to fully characterize the orbit and to solve Eq. (42) forΦ, we need to know the evolution of the Euler angles α and ι. To do this we must find how the unit vectork = L/L given in Eq. (30c) precesses. This can be done using the equation of motion found in Eq. (6a): (47) And thus we see that we will need also the evolution of S 1 and S 2 given in Eqs. (6b) and (6c) respectively. We can simplify these equations somewhat using that the magnitude of the individual spins S 1 and S 2 is conserved, and thus we can write: And then the equations of motion that we need to solve will be: Rewriting L in terms of the parameters defined in Eq. (25) and r in terms of Eq. (27a) we have the fol-lowing equations we have to solve: (50c) These equations will have to be solved numerically and to integrate them it will be necessary to find the initial conditions. Assuming that the initial orientations of the spins are given by (θ i 1 , φ i 1 ) and (θ i 2 , φ i 2 ) forŝ 1 andŝ 2 respectively, we can use the conservation of the total angular momentum J = L + S 1 + S 2 = Jê z to find the initial conditions fork: Using the expressions of Eq. (48) for S 1 and S 2 and expressing L in terms of the parameters of Eq. (25), we can rewrite Eq. (51) in the following way: where k z is fully specified by the fact thatk is a unit vector (|k| = 1). Using these initial conditions to integrate the equations of motion we can extract the values of α,α and ι from Eq. (30c), obtaining: The values of these angles can be plugged into Eq. (42) forΦ and solve this equation numerically to get an expression for Φ using the initial condition of Φ| t=t0 = Φ 0 . This together with the expressions in Eq. (27) for the radial part fully specifies the orbit.

B. Gravitational wave emission
Now that we have obtained a method to compute the orbit followed by the black holes, we will want to determine the gravitational waves emitted by the system. In this section we will first derive the leading order gravitational wave emission to get a better grasp of the physical process as well as to introduce the variables and degrees of freedom involved. This leading order computation will also allow us to understand how to derive the more cumbersome expressions for the gravitational wave emission that take into account up to spin-orbit corrections. The full step by step derivation of these higher order expressions is out of the scope of this project and they will be taken from the literature, since they are the same formulas used in the compact binary coalescence templates [29], which are very well documented.
To compute the leading order gravitational wave emission we start from the quadrupolar formula derived by Einstein in 1918: where R is the distance from the encounter to our detector, t r is the retarded time (t r = t − R/c) and Q ij are the components of the quadrupole moment defined as: The 0-component of the stress energy tensor T 00 (t r , x) is given in our system by the following Newtonian accurate expression: where r(t) is the solution for the orbit found in the previous section. Eq. (56) can be substituted in Eq. (55) obtaining: To compute the gravitational waves using Eq. (54) we have to differentiate Q ij twice with respect to time: where we can introduce the Newtonian accurate expression forr i :r Substituting this in Eq. (58) we obtain that the second derivative of the quadrupole moment of the system is: Which when substituted in Einstein's quadrupolar formula of Eq. (54) gives: For the gravitational waves we will be interested in the transverse traceless part of this strain tensor. To compute it we will use the transverse traceless projection operator P ijkm (N ) which projects vectors into the plane orthogonal toN , whereN is the unit vector pointing from the encounter to the detector. This projector is defined as: where P ij = δ ij − N i N j . And thus, using this projector we have that: where we have used that h T T ij = h T T ij . To write down the two polarizations of the gravitational waves h + and h × , it will be convenient to define a new triad: whereê z is the unit vector pointing in the direction of the conserved total angular momentum J introduced in Eq. (9). In this new triad, the two polarizations will be given by: Substituting here Eq. (63) for h T T ij and expressing it in a vector form, we obtain the following expressions for the leading order gravitational waves emitted: To find expressions forp andq in the frame in which the orbits were solved, we note that if we define Θ as the angle formed betweenN andê z , then invoking the arbitrariness in the orientation of the triad (e x , e y , e z ), we can always write:N = sin Θê x + cos Θê z .
The gravitational waves of Eq. (66) emitted by the system will have a 2.5 PN order dissipating effect on the orbit that can be modeled using the following coupled differential equations [30,31]: where β = e t cosh v − 1.
The formulas for the gravitational wave strain h + , h × of Eq. (66) only take into account the leading order postnewtonian contribution, but since we have computed the orbit to order 1.5 PN, we can use this orbit to compute gravitational waves with higher order contributions. Taking into account up to leading order spin-orbit contributions to the gravitational wave emission follows a similar procedure to the one used to arrive at the leading order expression of Eq. (66), but taking into account higher order corrections of the quadrupole moment. These formulas are also used in the computation of gravitational waves from compact binary coalescence and the derivation of the contributions up to spin-orbit corrections can be seen in the literature [32]. The result one obtains is: It can be seen that the first terms of h + and h × in Eq. (71) coincide with the leading order expressions derived in Eq. (66). Subsequent terms correspond to the higher order contributions and will in general be smaller.

C. Numerical implementation
In this section we will gather all the formulas that we have been discussing, to show how the templates are actually computed. For the numerical implementation we create a Python code that takes the following input: where all parameters have been discussed in previous sections except t 0 and t f which are the initial and final times of the simulation, and χ 1 and χ 2 which are the Kerr parameters of the two black holes that take values between 0 and 1 and measure the magnitude of the component spins. They are defined in the following way: Using this input, the problem is completely specified and the linear system of differential equations that has to be solved is: To be able to evaluate the derivatives in Eq. (74), we need to compute the values of the parameter v at each time t. This is done solving the following transcendental equation: To obtain an accurate solution to this transcendental equation in an efficient way we use Mikkola's method [33], which was specifically designed to solve this common type of equation in orbital dynamics. Note that writing the relation between v and t in this manner fixes the periastron time at t = 0. The values of α,α, ι andι appearing in Eq. (74) and that are needed to fully characterize the orbit can be derived using the following formulas: To integrate the system of differential equations of Eq. (74) we will also need the initial conditions, which will be given by: From these initial conditions the value of the constant of motion Σ is also deduced: With the initial conditions and all the variables appearing in the differential equations of Eq. (74) now specified, we can numerically integrate them. To do this in an efficient and fast way, we use an explicit Runge-Kutta method of order 5(4) [34], where the error is controlled assuming accuracy of the fourth-order method, but steps are taken using the fifth-order accurate formula.
Having the solution for the differential equations of Eq. (74), we can compute r,˙ r,p,q andN given by: sin ι sin α sin Θ + cos ι cos Θ .
And these vectors, which are the solution of the orbit, can be substituted in Eq. (71) to obtain the gravitational waves emitted by the system taking into account up to spin-orbit corrections.
In Fig. 1 we can observe a representative example of the type of orbit r(t) one obtains when solving the equations of motion of Eq. (74). We have drawn arrows on the orbit representing the effective spin of the system S eff as We have superimposed on the orbit the effective spin of the system S eff , showing its direction and magnitude with an arrow. defined in Eq. (4). Even though the maximum velocity reached is quite high, at around 0.36c (the expansion parameter of the PN approximation will be v 2 c 2 ∼ 0.13), the orbit is still hyperbolic and the effective spin is almost constant, meaning the higher order post newtonian corrections are "small", as should happen for the expansion to be valid.
The hyperbolic event shown in Fig. 1 will emit gravitational waves, in accordance with Eq. (71). We show in Fig. 2 the strain from these gravitational waves for the two polarizations. This image is very representative of the shape of the gravitational wave strain we expect from close hyperbollic encounters. We can see that as the black holes get closer to each other, the absolute value of the strain increases for the two polarizations and it reaches a maximum before they get to the point of closest approach (at t = 0). Due to the quadrupole nature of the gravitational waves (the frequency of the waves is twice the orbital frequency), the absolute value of the strain falls down before the closest approach and changes sign. The absolute value reaches again a maximum after closest approach and then as the black holes drift further appart, the absolute value diminishes again until it reaches a constant value. All in all, the gravitational waves emitted during hyperbolic encounters perform only one oscillation.
Note that the asymptotic value of the strain after the encounter is different from the one before the encounter. This is known as the gravitational wave memory effect FIG. 2. Gravitational waves emitted by the system shown in Fig. 1 assuming it happens at a distance R = 20Mpc and with an inclination of the orbit Θ = 45 • . [35], and has been studied in the literature for hyperbolic encounters [31].

III. SIGNAL IN A GRAVITATIONAL WAVE DETECTOR
In this paper we are not only interested in the purely theoretical aspects of the gravitational wave emission of black holes in hyperbolic orbits, we also want to look for this type of events in current gravitational wave detectors. To this end, we will need to determine and characterize the signal that these gravitational waves leave in a detector.
Right now there are many observatories looking for gravitational waves via different experimental setups, using resonant mass antennas [36], pulsar timing arrays [37] and laser interferometers.
Laser interferometry is to date the only method that has been able to confidently observe gravitational waves, and it offers by far the best sensitivity to detect transient gravitational wave events. Because of this, we will focus our search to the data from laser interferometer gravitational wave observatories, in particular from Advanced LIGO and Advanced Virgo.

A. Laser interferometers for gravitational wave detection
The basic design of a laser interferometer for gravitational wave detection is to have a Michelson interferometer decoupled from outside forces. This is done using pendulum suspension for the mirrors and for the optical benches. The interferometer is tuned to be in destructive interference in the absence of gravitational waves. Whenever a gravitational wave passes through the de-tector, the strain will change the effective length of the arms, the interference will not be completely destructive anymore and a signal will be observed at the photodiode. This basic design is schematically shown in Fig. 3.
Of course to improve sensitivity the simple design of Fig. 3 gets more complicated [38]. All Earth-based laser interferometers looking for gravitational waves add a Fabry-Perot resonant cavity in each arm to build up the phase shift produced by the arm length change. Another standard feature added is power recycling, a technique to increase the effective power of the laser by adding power recycling mirrors to form a resonant cavity between the laser source and the Michelson.

Antenna patterns
Laser interferometers are only sensitive to the differential change in the length of their arms. Because of this, we will have to determine how the gravitational wave strain h + and h × will affect the length of the arms, and in this way go from the theoretical prediction obtained in Sec. II to a physical observable.
We will consider the case in which the wavelength of the gravitational wave is much larger than the size of the arms of the detector (L/λ gw 1). For detectors like LIGO and Virgo with an arm length of 3-4 km, this is satisfied for frequencies smaller than ∼ 10kHz. In this large wavelength approximation, we can ignore spatial variations of the metric inside the detector. Using the proper detector frame, the motion of the mirrors at the end of the interferometer arms will be governed by the geodesic deviation equation [39]: We will now need to define a coordinate system with which to carry our computation. We choose a coordinate system as shown in Fig. 4, where we have a detector reference frame (x, y, z) such that the arms of the interferometer are along the x and y axis, and we have a source reference frame (x , y , z ) such that the propagation direction of the gravitational wave coincides with the z axis. The transformation between these two coordinate systems can be given in terms of the usual Eulerian angles θ, φ and ψ. That is, the transformation from (x , y , z ) to (x, y, z) will be given by [40]: Since the value of the gravitational wave strain h ij is assumed to be very small (usually h ∼ 10 −21 ), then in the detector frame, the test mass at the end of the arm along the x-axis will be displaced from its equilibrium position by a small amount, that is ξ j = (L, 0, 0)+O(hL). Looking at how this test mass moves along the length of the arm, we have: We can get an equivalent expression for how the test mass in the y-axis moves in the y-direction: And the differential change in the length of the arms can be gotten subtracting Eq. (83) from Eq. (82): We can integrate this expression using that in the absence of gravitational waves the interferometer is tuned to have δL = 0. Keeping only linear terms in h we obtain: To connect h xx and h yy in the detector frame with h + and h × in the source frame in which the wave propagates in the z direction, we use that in this source frame the strain is given by: And since the gravitational waves are tensorial, they transform in the following way: Substituting this in the expression for the differential change in arm length of Eq. (85) we get: where h ≡ δL/L is called the strain amplitude and it is the actual quantity that can be measured by laser interferometers. F + (θ, φ, ψ) and F × (θ, φ, ψ) are the antenna patterns and they contain all the dependence with the angles that appears in the transformation from the source to the detector frames. They are given by: We show the antenna patterns for 0 polarization angle (ψ = 0) in Fig. 5. The antenna patterns for non 0 polarization angle can be gotten by combining these two with a rotation: (90) In Fig. 5 we can see how F + (θ, φ, 0) and F × (θ, φ, 0) vary depending on the incoming angle of the gravitational wave. At θ = π with φ = π 4 , 3π 4 , 5π 4 , 7π 4 we will have that the two antenna factors vanish, and since they vanish for ψ = 0, from Eq. (90) we have that they vanish for all polarizations. These four points are called the blind spots of the detector, since gravitational waves coming from positions in the sky close to them will be highly suppressed and almost impossible to detect.
We will be interested in the signal one gravitational wave will imprint in all the different detectors on Earth. These different detectors will be in different points of the globe, shown in Fig. 6, and placed with different orientations that vary over time due to the orbit and rotation of the Earth. We will locate a particular event in the sky using Earth's equatorial frame, shown in Fig. 6, in which the wave will have a right ascension α, declination δ and polarization angle ψ. We will have to relate these coordinates with the detector coordinates defined in Fig. 4, which will allow us to compute the antenna patterns and project the wave onto the detector. It will also be very important to take into account the fact that the wave will arrive at different times to each detector due to the distance between the detectors. This time difference is known as the light travel time between detectors and depending on the incoming direction of the wave it can be as long as 27ms between LIGO Hanford and Virgo, 26ms between LIGO Livingston and Virgo and 10ms between the two LIGO detectors.
In practice we will do the projection of the gravitational waves into the detector using lalsuite [42] which is a software developed by the LIGO collaboration that can be used for this purpose. It takes into account all the antenna pattern and light travel time effects using accurate positions and orientations of the detectors as well as small Doppler shift corrections due to the Earth's rotation and orbit.
To exemplify how the projected waves look, in Fig. 7 we show the result of projecting the gravitational waves of Fig. 2 into the gravitational wave detectors, assuming that they come from δ = 1.0 rad, α = 3.7 rad, with ψ = 0.2 rad and with the periastron time (t = 0 in the simulation) taking place at 17:29:18 UTC of 2017-08-19 at the center of the Earth. We observe that even though it is the same gravitational wave in all interferometers, because of the antenna factors, it will have a different amplitude and shape in each of them. In the specific case shown here, the biggest amplitude is recorded in Virgo.

Characterizing the stationary noise
From the interference pattern one observes in the photodiode of the interferometer, one can in principle reconstruct the value of δL/L and thus obtain the value of the strain amplitude h(t). In practice, the output of any real detector will also contain noise, so the reconstructed quantity s(t) will actually be given by: where h(t) will be the part of the detector output coming from the gravitational wave signal, while n(t) is the noise and it accounts for all the rest of the output coming in general from a variety of sources. The noise n(t) can usually be assumed to be stationary, this means that its properties do not vary in time. If the noise is stationary, the different Fourier components are uncorrelated, and thus: where ... denotes the ensemble average, that is, the average over many realizations of the same system. Eq. (92) defines the noise power spectrum distribution S n (f ). Since n(t) is real,ñ(−f ) =ñ * (f ) and therefore S n (−f ) = S n (f ). In addition, n(t) is dimensionless like the strain, and therefore S n (f ) has dimensions of Hz −1 .
Stationary noise will be fully characterized by S n (f ). To get an idea of the sensitivity of the detector, one defines the strain sensitivity as S n (f ), which gives an idea of the amplitude of the noise at a given frequency. Values of the strain smaller than S n (f ) at a given frequency will be difficult to measure. In Fig. 8 we have represented the design strain sensitivity for the advanced LIGO detector [2]. This is very representative of modern gravitational wave interferometric detectors. Below 20 Hz the sensitivity is very limited by the seismic noise as well as by the thermal noise in the suspension of the mirrors. Above 10 3 Hz the sensitivity starts to worsen again due to the quantum noise, which encompasses the effects of statistical fluctuations in detected photon arrival rate (shot noise) and radiation pressure due to photon number fluctuations.
The noise power spectrum distribution S n (f ) will be extensively used in the data analysis and therefore it will be extremely useful to compute it. In practice we only have one realization of the system and the amount of time we measure is finite and sampled in a discrete grid. Because of this, doing the ensemble average of Eq. (92) will not be a viable way to get S n (f ). We will estimate S n (f ) using Welch's method [43].
The basic idea of this method is to divide the signal into successive overlapping segments, computing the periodogram for each segment and averaging. Let x(n), n = 0, ..., N − 1 be the vector whose power spectra  [41]. Right: We show the Earth equatorial frame used to locate sources in the sky. (X,Y,Z) is the wave frame, δ is the declination, α is the right ascension and GHA is the greenwich hour angle. Image from Ref. [42].
we want to compute. We take windowed segments of this vector x k (j), possibly overlapping, of length L, and with the start of consecutive segments being D units apart.
That is: x m (n) = w(n)x(n + mD) n = 0, .., L − 1 , where w(n) is the window function. In our analysis we will always use the Hanning window to smooth discontinuities at the beginning and end of the sampled signal. It is given by: We define K as the total number of segments we have ((K − 1)D + L = N ), and thus m = 0, 1, ...K − 1. We compute the periodogram for each one of the segments: where FFT denotes the fast Fourier transform and U is a normalization that has to be introduced to take into account the effect of the window and it is given by: The Welch estimate for the power spectral density is then given by the average of the periodograms over all the segments, where we also have to take into account that the spacing between consecutive points of x(n) is ∆t and the fact that we defined S(f ) with a factor 1 2 in Eq. (92): which has the correct Hz −1 units. In general for the computation of the noise power spectrum distribution of a gravitational wave detector we will always use the detector output for the strain s(t), which in principle contains the signal h(t) and the noise n(t). This is justified because the data of current gravitational wave detectors are vastly dominated by noise, except when a gravitational wave event takes place.
When we have a gravitational wave event candidate in our data and we want to analyze it, to compute the power spectrum distribution we will exclude the windows that we suspect have gravitational wave signal, and we will just compute the power spectrum around them.
In Fig. 9 we show the strain sensitivity S(f ) obtained when applying Welch's method to 4096s of measured strain of the two LIGO detectors and the Virgo detector. We observe that even though the general shape of the strain sensitivity is very similar to the design one in Fig. 8, the experimental curve has a lot of spikes on top of the design curve. These spikes come from so called "technical" sources of noise, which comprises a large variety of environmental sources such as the noise coming from the pumps that keep the vacuum inside the interferometer arms, the coupling of electrical circuits, etc.
Note in Fig. 9 that the LIGO detectors are more sensitive than the Virgo detector, having about a factor of 3 better sensitivity along all the frequency spectrum. From the LIGO detectors, Livingston has better sensitivity than Hanford, specially at low frequencies (f 100 Hz), due its superior suspension system. At high frequencies (f 100 Hz) both LIGO interferometers have similar strain sensitivities.

B. Signal processing
As we have seen in Sec. III A 2, detector data will not only contain the gravitational wave signals we are looking for, but also noise. For current detectors such as LIGO-Virgo, we expect the amount of noise to be very significant when compared with the amplitude of gravitational waves. In this section we will explain the methods that will be used to reduce the impact of the noise and represent the data in such a way that the gravitational wave signals from close hyperbolic encounters are enhanced.
To exemplify the results of each step of the signal processing we will use the example gravitational wave from a hyperbolic encounter that was shown in Fig. 7, which we will inject into the experimental output of the detector. Since we are assuming that the response of the detector is linear, injecting the gravitational wave will equate to adding it to the experimental strain (s(t) = s exp (t) + h(t)). The result of doing this is shown in Fig. 10, where we have also plotted the gravitational wave signal h(t), whose amplitude is so much smaller than the noise that its features can not be seen.

Filtering
The gravitational wave data we are using is sampled at a frequency of 4096Hz. Because of this, we can explore Fourier modes below the Nyquist frequency of 2048Hz. Nonetheless, not all the frequencies will give us useful information, and therefore we will filter the ones that are dominated by noise.
In the strain sensitivity of Fig. 9 we observe how at frequencies below 20Hz the noise greatly increases, mainly because of the seismic and thermal noises. Because of this we will apply a 20Hz high-pass filter to the data to remove frequencies below 20Hz, which will always be completly dominated by noise.
Because gravitational waves from hyperbolic encounters only perform one oscillation, the characteristic frequency of the waves will be closely related with the length of this oscillation (f c ∼ 1/∆t) and since this single oscillation will be well located in time, due to the Fourier uncertainty principle, it will have a high frequency spread. Since we are using the post Newtonian approximation, the characteristic length scale of the problem will have to be bigger than tens of times the Schwarzschild radius and the characteristic speed of the black holes will have to be much smaller than the speed of light. All in all, since ∆t ∼ ∆x/v, the largest frequency we can probe in the limit of validity of our post Newtonian approximation can be estimated to be around f c ∼ 1/∆t ∼ c 100Rs , which for typical black holes of 5M is f c ∼ 200Hz, but generally it will be lower.
Since high frequencies will not have gravitational wave signal from hyperbolic encounters and will thus be dominated by noise, we will ignore them by applying a 800Hz low-pass filter. Together with the 20Hz high-pass filter, this will equate to a 20-800Hz band-pass filter.
In Fig. 11 we show the Fourier structure of the gravitational wave of Fig. 7 that we have been using as an example, together with the 20-800Hz window we are using to band-pass filter the data. This simulated gravitational wave came from black holes of 20M and 15M and we can see how the signal becomes negligible at high values of the strain, well below the 800Hz value.
In Fig. 12 we show the result of applying the 20-800Hz band-pass to s(t) and h(t) of Fig. 10. Now the signal can be seen much better in the strain of the LIGO detectors, since we have removed the frequencies that are the source of the most noise. Nonetheless if we did not have the actual injected signal in Hanford to guide the eye, it would not be so easy to see its presence in this detector, since Hanford still has large fluctuations of noise. Lastly, we observe that Virgo is still dominated by noise

Whitening
After filtering away the very low and very high frequencies, which are completely dominated by noise, we note that the remaining noise power spectrum at the laser interferometer depends strongly on frequency (see Fig. 9), because of this we say that the noise is "colored".
At the frequencies at which the noise power spectrum is bigger, large strain values will be less significant. Because of this, we will want to weigh down the strain at the parts of the frequency range with the higher values of the noise power spectrum. We will do this in such a way that the transformed strain would have a flat power spectrum if it contained only noise. This will be the whitened strain. The name comes from the fact that noise with a flat power spectrum is said to be "white".
Since in Eq. (92) the power spectrum is defined proportional to the average of the modulus squared of the Fourier transform, the whitened strain will just be: where to get the whitened strain in the time domain we just compute the inverse Fourier transform of Eq. (98). The result of doing this whitening to the filtered signals of Fig. 12 is shown in Fig. 13. We can see that whitening the strain further enhances the signal with respect to the noise. Now the gravitational wave event can be clearly seen in the two LIGO detectors, even if we did not have the injected event to guide the eye. The situation in Virgo is also much more improved, since we have suppressed the frequencies that contained the most noise.
Since gravitational waves from close hyperbolic encounters only perform one oscillation, they will not only have a characteristic spread in time, but they will also have a large characteristic spread in frequencies, as shown in Fig. 11. Because of this it will be interesting to study the signals in the time-frequency domain.
To convert our whitened data into the time-frequency domain we will use the Q transform [44]. The Q transform can be seen as a series of filters centered at different frequencies f , each having a bandwidth δf , such that the quality factors of the filters has a constant value Q [45]: In practice, the way it will be computed will be by taking the Fourier transform of the whitened strains(φ), applying a window centered at frequency f with characteristic size δf = Q/f and then computing the inverse Fourier transform, to obtain s(t, f ). That is: where one possible convention for the window function [46], that will be used during all this analysis is the bisquare window: FIG. 11. Fourier transform of the gravitational wave shown in Fig. 7. We have marked with black lines the 20-800Hz frequency window we are using to band-pass the strain.
which satisfies the requirement of having δf = f /Q and is normalized. For our purposes we will not be interested in the phase of the Q transformed strain, we will be mainly interested in the energy deposited at each point of the time-frequency domain, which will be defined as the modulus squared of Eq. (100). In addition, we will want to normalize this energy so that the noise floor is set at unity. This is done by dividing by the median value of the energy. The median value is well suited for this normalization because it will correctly estimate the noise floor without being very affected by possible large fluctuations in the Q transform due to large signals being present. We then have that the normalized energy in the time frequency-domain will be: In Fig. 14 we show the normalized energy computed from the whitened strain of our example shown in Fig. 13. The Q transform of this example, as well as in the rest of the analysis, will be computed using a quality factor of Q = 8 which was found to be the typical value that maximized the maximum of the normalized energy of Eq. (102) when injecting close hyperbolic encounters within the parameter space studied (see appendix B) into simulated Gaussian noise from advanced LIGO at design sensitivity [2].
In this representation, the gravitational wave event clearly stands out above the noise in all three interferometers, and we can see its structure in the time-frequency domain. In general, gravitational waves from close hyperbolic encounters will look similar to the ones shown in Fig. 14, the normalized energy is symmetric in time and at low frequencies is suppressed because of the large value of the power spectrum distribution by which we have divided when we whitened it. At large frequencies the normalized energy decays because as we saw in Sec. III B 1, hyperbolic events are heavily skewed towards small frequencies. The final resulting shape of hyperbolic events is teardrop like, with the actual shape of the teardrop depending on the parameters of the close hyperbolic encounter as well as the detector sensitivity at that moment.

Signal to noise ratio
To characterize how loud a signal is with respect to the noise, the most commonly used metric in gravitational wave astronomy is the signal to noise ratio [39]. It is computed by looking at how much signal there is at each frequency and weighting it by the noise power spectrum. That is: whereh(f ) is the Fourier transform of the gravitational wave signal projected into the detector and f min and f max are the lower and upper limits of the frequency range under study (in our case they would be 20Hz and 800Hz respectively).
With this definition, the signal to noise ratio of the example event that we have been analyzing throughout this section, shown in Fig. 14, would be of 20.1 in Livingston, 11.1 in Hanford and 6.7 in Virgo. To get an idea of the significance of an event in a detector network, the signal to noise ratios in the different detectors are summed in quadrature [39]. That is: Using this formula, the total network signal to noise ratio of the example event, shown in Fig. 14, would be 23.9.

IV. SEARCH FOR CLOSE HYPERBOLIC ENCOUNTERS IN LIGO-VIRGO DATA
Now that in Sec. III we have determined the kind of signals that gravitational waves from close hyperbolic encounters leave in laser interferometers and how to process the data to make these signals stand out over the noise, we will want to look for them in available data of current detectors.
For this purpose we will have to select the data to be analyzed, which will be done based on data quality considerations. We will look for gravitational waves from close hyperbolic encounters in the selected data using a two step trigger. The first step of the trigger will take an approach similar to standard burst searches [24], being a theory independent loose preselection of possible candidates based in correlations between detectors in the time-frequency domain. In the second step of the trigger we will use the templates developed in Sec. II to train a neural network to look at which of the preselected events look like gravitational waves from close hyperbolic encounters.

A. Data selection and quality
Gravitational wave astronomy benefits a lot from having multiple detectors online at the same time, since the more detectors a signal is seen in, the more confidence can be had that it was of astrophysical origin, as opposed of coming from a terrestrial noise source. Because of this we will limit our search to data taken with the three detectors (LIGO Livingston, LIGO Hanford and Virgo) operational and that is publicly available [23]. At the time the analysis was done, this limits our search to less than 4 weeks of data at the end of the second observing run O2, between 2017-08-01 and 2017-08-26.
Nonetheless, laser interferometers are very delicate and sensitive machines, and because of it, the LIGO and Virgo detectors are not always in observing mode, and when they are, sometimes the data quality is not good enough for analysis. For our search we will only use the data in which the three detectors were operating under nominal conditions, that is, with no known hardware issues. The times that satisfy this condition are shown in the last row of Fig. 15, where we can see that the amount FIG. 14. Square root of the normalized energy derived from the Q transforms of the whitened signals of Fig. 13. The Q transform is computed using a quality factor Q = 8. of data rejected for our analysis is quite large, fracturing the timeline into many segments of very different lengths. From the 24.5 days that the three detectors were measuring at the same time, only 15.3 days (62% of data) will satisfy the data quality requirements and will be used for the analysis.

B. Non-stationary noise: glitches
Real laser interferometer data will not only contain stationary sources of noise as the ones discussed in Sec. III A 2, but it will also contain some transient sources of noise. These are short duration perturbations in the strain measured by the detector that come from a terrestrial origin, usually because of environmental or instrumental factors.
Transient sources of noise are much harder to clean using the signal processing methods discussed in Sec. III B because, since they are strain perturbations that have a small duration in time and are only observed once, in many ways they will behave like transient gravitational wave signals. The large transient noise events will be called "glitches". There are many different types of glitches, classified by their morphology. In Fig. 16 we show different classes of glitches as classified by Ref. [47]. The glitches are represented in the time-frequency domain following a signal processing very similar to the one explained in Sec. III B to make Fig. 14. Some of the glitches, such as blip glitches, low frequency burst glitches or chirp glitches can look similar to the gravitational waves from close hyperbolic encounters we are looking for, and they will constitute the main source of background in our search.

C. Image preselection: correlation trigger
As we have seen in previous sections, gravitational wave detectors will have a lot of noise coming from many different terrestrial sources. Nonetheless, because the detectors are hundreds of kilometers apart and do not interact with each other, we would expect that their noise will be in general uncorrelated.
In opposition to this, whenever a gravitational wave passes through the Earth, it will leave a simultaneous imprint in all the detectors (within the milliseconds of light travel time between detectors). Because of this, gravitational waves will induce correlations between the measurements of the different detectors. We will exploit this feature to generate a theory independent trigger to look for gravitational wave candidates.
Even though the antenna factors will change the amplitude and phase of the gravitational wave strain observed in different detectors, the duration and frequency of the signal will not change, and we expect to see an excess of normalized energy in the same region of the time-frequency domain for the three detectors.
We will thus look for positive correlations between the normalized energies of the detectors, that is, excess normalized energy in the same time-frequency region (within the light travel time between detectors) in the three detectors. As we will see, this method works well to look for gravitational waves from close hyperbolic encounters since in this case the strain only performs one oscillation and because of this, the normalized energy has a large area in the time-frequency domain.
The correlation between the normalized energy of two different detectors will be quantified using Pearson correlation coefficient r [48]. If x n and y n (n = 0, ..., N − 1) are the two vectors between which we want to compute the correlation, their Pearson correlation coefficient will be defined as: (105) This coefficient satisfies that if x n = |α|y n , then r = 1 and if x n = −|α|y n , then r = −1. If there is no correlation between x and y, then r = 0. In general the more positive correlation there is, the closer to 1 that r will be.
To create a trigger implementing the correlation between detectors, we will first apply the signal processing discussed in Sec. III B. That is, we will band-pass filter the signal in a 20-800Hz range, whiten it, and compute  [47]. The glitches are represented in the time-frequency domain following a signal processing very similar to the one used to make Fig. 14. The x axis corresponds to time, the y axis to frequency and the color corresponds to the square root of normalized energy as defined in Eq. (102). Image taken from Ref. [47]. the normalized energy in the time-frequency domain using the Q transform.
Since Livingston is the detector with the best sensitivity, specially at low frequencies where hyperbolic encounters deposit most of their energy, we will use it as the reference interferometer, and we will compute the correlation of Hanford and Virgo with it.
We will divide Livingston's normalized energy in images such as the one shown in Fig. 14 that will be 0.3 seconds in length and creating an image every 0.15 seconds (50% overlap). The images will be 200x200 pixels in size. Once we have the image in Livingston defined, we generate corresponding images at the same time at Hanford and Virgo, allowing for small time shifts due to the light travel time (±0.010s with Hanford and ±0.026s with Virgo).
We then compute the correlation between the normalized energy in the pixels of the images using Pearson's correlation coefficient of Eq. (105) and we pick the Hanford and Virgo time shifts that yield the highest value of the coefficient, since we are looking for positive correlations between the normalized energies. This yields a value for the correlation between Livingston and Hanford r L1−H1 and between Livingston and Virgo r L1−V 1 1 . The trigger will accept as candidate events the images that satisfy: where D is the discriminant of the trigger. The factors of 2 3 and 1 3 were chosen to give a higher weight to the correlation with Hanford, since this detector has better sensitivity than Virgo and because of it, the correlation of Livingston with Hanford is more likely to have a gravitational wave origin. With this correlation trigger we wanted to have a preselection of gravitational wave candidates to then feed to the neural network that will do the final clasification. We put the minimum value of the discriminant D to be accepted at 0.3 to reject a large amount of the noise that is randomly correlated between detectors while accepting most of the loud enough gravitational waves.
This choice of parameters in Eq. (106) is discussed in greater detail in appendix A.

Trigger validation with injections
To test how the trigger works at detecting gravitational waves from close hyperbolic encounters, we will inject signals with random parameters (as specified in Table  VI of appendix B) at random times in the experimental data. We will then run the trigger on the data containing these injections to study how many of them we are able to detect.
We will inject a total of 169108 signals with total signal to noise ratio between 4 and 40. The distribution of the injections with the signal to noise ratio is shown in Fig. 17, where we see that it follows a power law, in accordance with what is predicted in the literature [49] for any gravitational wave source with randomly generated parameters. We have also plotted in Fig. 17 the events that are recovered from the injections. To see how good the trigger is at recovering events as a function of the signal to noise ratio, in Fig. 17 we have plotted the efficiency of the trigger, defined as the fraction of injections that are recovered at a given signal to noise ratio.
At low values of the signal to noise ratio we recover only a small fraction of the injected events, because they are not loud enough to induce strong correlations between the detectors. As the signal to noise ratio of the events increases, the efficiency increases because the louder an Event Name S/N D = 2 3 rL1−H1 + 1 3 rL1−V event is, the more correlations it will induce and the bigger the discriminant will be. Above a signal to noise ratio of 10, we recover more than 50% of the injected events. The discriminant as a function of the signal to noise ratio is shown in Fig. 18. We can see that the discriminant monotonically increases as the signal to noise ratio increases. Nonetheless, it has quite a large spread, this is partly due to the effects that the different antenna factors of the interferometers have. For example if the wave comes from the blind spot of one interferometer, the correlation between this interferometer and the others will be suppressed. In addition, if the gravitational wave signal happens close to a large fluctuation of noise in a detector, since this large fluctuation of noise will not appear in the other detectors, it will diminish the correlation. This trigger will also have different sensitivities towards events with different parameters. Events at low frequencies will be easier to detect, since these low frequencies also correspond to longer times and the correlated area in the time-frequency domain will be larger.
Nonetheless, after testing the trigger with injections we can confidently say that the basic idea works for gravitational waves from close hyperbolic encounters, since the discriminant monotonically increases as the loudness of the event increases and it is able to recover most of the events above a signal to noise ratio of 10.
As an additional validation of the trigger, we note that in the 15.3 days of data we are analyzing, there were 4 gravitational wave signals claimed by the LIGO-Virgo collaboration [50], all of them coming from the coalescence of compact binary objects. Since the correlation trigger we have developed is theory independent, it should also recover these events.
The result of checking the output of the trigger at the time of these events is shown in Table I. Our trigger recovers all events except GW170818. This event is not recovered because it was mostly seen in Livingston, since it was close to the blind spots of Virgo and Hanford, suppressing the correlations with these interferometers.
We also note that even though GW170817 has a very large signal to noise ratio of 33, the trigger discriminant is not as high as we would expect from Fig. 17, this is because this event is a binary neutron star merger [5] which happens at very high frequencies and with a very large quality factor. Because of this, the area of its normalized energy will be small, making the discriminant of our trigger smaller.  18. Two dimensional histogram of the trigger discriminant and the signal to noise ratio for injected events. To obtain the trigger discriminant as a function of the signal to noise ratio and mitigate the effects of injecting different number of events at different signal to noise ratios, we have weighted each event by the inverse of the number of events injected at its signal to noise ratio.

Trigger false alarm rate
The correlation trigger that we have designed will not only accept gravitational wave events, but it will also accept noise that is randomly correlated between detectors. To estimate how many noise events will trigger on average, we will run our trigger in the LIGO-Virgo data, but we will shift the timeline of the Hanford and Virgo detectors by several seconds in such a way that if there are correlations between the data, they can not possibly come from gravitational waves (gravitational waves must be correlated within milliseconds).
By shifting the timeline of the detectors by different amounts we can get an almost arbitrarily large amount of "noise" data. To study how the trigger reacts to the randomly correlated noise in this data, the main quantity that we will look at is the false alarm rate (FAR). The false alarm rate quantifies how many events with trigger discriminant D higher than some value we can expect per unit of observing time. That is: where N events (D > x) is the number of events with trigger discriminant higher than x we observe in the noise data, and T noise is the duration of the noise data.
To study the false alarm rate with enough statistics, in total we have used the 15.3 days of true data to obtain 642 days (T noise = 1.75 yr) of noise data by shifting the timelines of Hanford and Virgo by different amounts. The result for the false alarm rate obtained is shown in Fig. 19, where we have fitted it to a smooth curve to guide the eye. We can observe that the value of the false alarm rate exponentially decreases as the discriminant increases. This is what we would expect of a well behaved trigger, events with high values of the discriminant are increasingly more rare because randomly having very strong correlations between detector noise is unlikely to happen.
Having the false alarm rate, we can estimate how many events we could expect to see above a certain discriminant as a function of the observing time T observed : This expected number of events can be compared with the number of events one observes when running the trigger in the data without applying any time shifts. When doing this we got 2704 observed events having the distribution with the trigger discriminant shown in Fig. 20. We can see that there is a more than 3σ excess in the observed events at large correlation values. This is due to one event with a discriminant value of D = 0.647, which from Fig. 19 we can see it corresponds to a false alarm rate of 1.5 yr −1 , that is, we expect one such event every 240 days, but we got one in 15.3 days.
This event is shown in Fig. 21. Even though it has a very large value of the correlation, this does not necessarily mean it will be a gravitational wave signal. As was discussed in Sec. III B 3, we would expect Virgo to have in general the smallest normalized energy, since it has by far the largest noise power spectrum distribution and we are dividing by this quantity when whitening the signal. Nonetheless, in this case it has the largest normalized energy. This could be due to the fact that the gravitational wave is coming from close to a blind spot of Hanford and Livingston but right in the most sensitive part of the sky for Virgo. To confidently determine whether or not this triggered event, as well as all the others, can come from gravitational waves of close hyperbolic encounters, we will need to further analyze the triggered events with a method that takes into account the properties of the signal. This will be done in next section using a neural network.

D. Neural network
To determine which of the 2704 events that were selected by the correlation trigger could have been produced by gravitational waves coming from close hyperbolic encounters, we will need to compare them with the theoretical predictions developed in Secs. II and III. This will be done by training a neural network to classify the images of the normalized energy selected by the correlation trigger into two classes, "noise" images and "CHE" (close hyperbolic encounter) images.
To do this we will first have to specify the characteristics of the images we want to classify, and based on this we will design the architecture of our neural network. We will then have to train this neural network and test and validate its performance. Finally, when we are confident on the validity of the approach, we will classify the data images and find the most promising close hyperbolic encounter candidates.

Image samples
We will need to generate image samples to train and validate our neural network as well as to make predictions on the data. Careful considerations need to be taken to generate all the images in the same way so as to not introduce biases in our samples. This is done by running the same image generation code on different sets of data.
The code in question will run the correlation trigger described in Sec. IV C and for each event that triggers it will generate an image centered around that event, such as the ones shown in Fig. 22.
The images are 50 × 150 pixels in size because we vertically stack a 50 × 50 pixel image of each of the three detectors into the same image (Livingston in the first row, Hanford in the second row and Virgo in the bottom). This is done to feed the information about the three detectors at the same time to the neural network, and in this way be able to classify the event with all the available information. For each detector, the x axis corresponds to 0.7 seconds around the trigger time, while the y axis corresponds to logarithmically spaced frequencies between 20 and 800 Hz.
With the pixel color we represent the natural logarithm of the normalized energy on a fixed scale between 1 and 9. This fixed scale is desirable for a neural network, since it gives an absolute meaning to the color. We put the minimum value of the scale at 1 (which corresponds to normalized energy 2.7), above the noise floor to avoid Examples of images used to train and validate our neural network as well as to make predictions on the data. Left: We show a gravitational wave with a signal to noise ratio of 10.9 that was injected in data where the Hanford timeline had been shifted by +2s and the Virgo one by -1s. Center: We show an event that triggered when running the program on the data where the Hanford timeline had been shifted by +5s and the Virgo one by -5s. Right: We show the same candidate of Fig. 21 that was the most correlated event we found when running the trigger on the untouched data.
confusing the neural network with the many small fluctuations around this noise floor. The natural logarithm is chosen to increase the dynamic range we can fit in the image. Finally, the image will be represented in grayscale because we are only considering one quantity, the normalized energy, which can be represented in grayscale using only one channel. This grayscale will reduce the memory requirements by a factor of 3 with respect to using a colored "RGB" representation.
The noise samples are generated by running the trigger and image generator on the same 1.75yr of data of Sec. IV C 2 with the Hanford and Virgo detectors shifted in time. We obtain 112153 noise images from this shifted data, from which 64028 (corresponding to 1yr of data) will be used to train the neural network and 48125 (corresponding to 0.75yr of data) will be used for validation.
The close hyperbolic encounter (CHE) samples are generated by running the trigger and image generator on data containing the same 169108 injections of Sec. IV C 1, with signal to noise ratios between 4 and 40 and whose parameters are randomly generated as specified in Table  VI of appendix B. Before injecting the signals, the data of the Hanford and Virgo detectors is shifted in time, to avoid having the same background as for the true data samples. We will use different time shifts from the ones used to generate the noise images. In this way we obtain 89597 CHE images from the shifted data with injections, from which 45356 will be used to train the neural network and 44241 will be used for validation.
The data samples are generated by running the trigger and image generator on the synchronized LIGO-Virgo data. This will yield 2704 images corresponding to the accepted events of the first level trigger that we want to classify.

Neural network design
Neural networks are mathematical models used to tackle complex data analysis problems with Machine Learning. The design of the neural networks has been motivated by the functioning of the brain and just like in the brain, the basic building block of a neural network is the neuron [51]. In this context, a neuron is a function that takes a series of inputs x, weighs them with a series of weights w, adds a bias b and evaluates the result on an activation function φ. That is: where x and w are vectors of R n , with n being the number of inputs of the neuron. For the activation function different choices can be made. Some frequently used activation functions are shown in Table II.
To form a neural network, neurons have to be connected with each other. This can be done in many ways, but for our analysis we will focus on Feedforward Neural Networks as the one shown in Fig. 23. In this type of neural network, neurons are grouped into layers. The output of the neurons of each layer is fed as an input for the neurons of the next layer.
The first layer of the network is called the input layer, since it represents the data we want to feed to the neural network to analyze. In the case of image recognition, this input corresponds to the values of the image pixels.
The last layer of the network is the output layer, which will give the result of the neural network. In a problem of binary classification we will only need one output, between 0 and 1, that will represent the probability for the input image to come from a close hyperbolic encounter. The neurons in the layers between the input and the output are used only for the internal calculations of the network. This is why these layers are called hidden layers.
At initialization, we will have to set the connections and the activation functions of the neurons. Depending on the design of the neural network, some of the weights and biases can also be set at initialization, but usually a large amount of weights and biases are left as free internal parameters. These internal parameters will be tuned during the training process to optimize the network at fitting the training data.
The best suited type of network for image classification is the Convolutional Neural Network, which will be the one that will be used in our analysis to classify the data images. Convolutional Neural Networks are a type of Feedforward Neural Network characterized by the fact that neurons of one layer are not connected with all the neurons of the next layer, but they are only connected with the nearby neurons. This is desired because in images, pixels are usually locally very correlated, that is, their value strongly depends on nearby pixel values. Instead of analyzing all the pixels at the same time without considering their position in the image, with Convolutional Neural Networks we will extract local patterns in progressively large scales of the image and relate them to each other. This local connectivity will greatly reduce the number of free parameters and make the training more simple and generalizable. To construct a Convolutional Neural Network we will make use of three different types of layers [51].
The first type of layer we will need will be the Convolutional layer. This layer will be defined by a number K of "kernels", each one of which is a matrix of size N ×N whose parameters will be determined during training. The neurons of the convolutional layer will perform the convolution operation between one of the K kernels and a N × N region of the previous layer matching the kernels size. We will also need to specify the stride S, which is the number of pixels in the previous layer by which the kernel center is moved to perform the next convolution. If the stride is greater than 1, it can be used to make the output of the convolutional layer smaller than its input and thus compress the information. To avoid miss-match effects we can define some zero-padding P , to pad zeros around the border of the input. Additionaly, a non-linear activation function such as ReLu or Sigmoid (Table II) can be applied to the result of the convolution.
The fact that neurons in a convolutional layer share the same weights through the kernels greatly reduces the number of free parameters. The sharing of weights means that the convolutional layer will extract local pixel patterns in a location independent way. This is useful because usually local pixel values are highly correlated (in our case they vary smoothly) and their individual values do not matter as much as the local pattern. Additionally, the relevant features we are looking for are usually location invariant in the image. When the neural network is optimized, it finds the kernels that highlight the most important features for the next layers.
Convolutional Neural Networks will also make use of Pooling layers, which are usually placed after convolutional layers to reduce the dimension of their output conserving as much information as possible. Pooling layers work in a similar way to convolutional layers, having a stride S, a zero-padding P and a pooling window size N × N that need to be specified. Nonetheless, the pooling operation is specified, rather than learned. The most common pooling methods are averaging over the pooling window or computing the maximum on each pooling window. To reduce the dimension of the input conserving as much information as possible, usually the stride has a value larger than 1, and the window size is larger than the stride. Pooling can introduce spatial invariance to the neural network, since pooling operations such as averaging or finding the maximum do not depend on the positions of the inputs inside the pooling window.
Lastly, Convolutional Neural Networks will also have fully connected layers, which are layers like the ones shown in Fig. 23, where the neurons of the fully connected layer are connected with all neurons of the previous layer. After specifying an activation function for the neurons, all weights and biases are free to vary independently. These layers offer great learning flexibility because of the many degrees of freedom they have from the maximal amount of connections. Having so many degrees of freedom can come at the cost of leading to overfitting (when the network fits the training data too specifically and can not generalize to different data). The increase in the number of parameters can also be a problem, not only because of the much greater amount of time it will take to train the network, but also because the training can be degraded.
This degraded training is a general problem of neural networks with many layers and free parameters and was noted in Ref. [52]. The basic problem is that if a network is sufficiently complex to find and relate all the relevant features of an image, adding more complexity will not benefit the result in any way, but it will just make the learning process more complicated and will result in general in worse performance.
To tackle this problem, the authors of Ref. [52] developed a type of Convolutional Neural Network called Residual Neural Network (ResNet). The building block of these networks is the Residual block, schematically shown in Fig. 24. The input of the block x will go through two paths, one path passes through different layers that transform it into F(x), which is called the residual, while the other path does not get transformed. At the end of the block, the two signals are summed. If the network already has computed all relevant features of the image, it will be easy for the network to do the identity mapping without degrading the signal by optimizing the residual F(x) towards 0. Because of this, residual neural networks usually do not get worse with increased number of layers, they plateau.
To classify the input images into noise or gravitational waves from close hyperbolic encounters, we will use a neural network based on one of the most refined architectures of Residual Neural Networks there is, called ResNet-50 (because it has 50 hidden layers). The actual network architecture we will use is detailed in Table III, where the first layer of the network will take as its input the pixels from the 50 × 150 pixel image we want to classify. Finally, note that the last layer will return one single number, and since the activation of this last neuron is a Sigmoid function, this number will be between 0 and 1 (see Table II). It will be interpreted as the probability for the image to be generated by gravitational waves from close hyperbolic encounters.
The actual implementation of the neural network will be done using the tensorflow [53] library of Python.

Training and validation
Even though we have defined in the previous section the architecture of the neural network that will be used to classify the images, this network design will contain a lot of free parameters with undetermined values. We will have to determine the optimal values of these parameters such that whenever we input an image with the format described in Sec. IV D 1 that contains gravitational waves from a close hyperbolic encounter, the neural network returns a 1 and whenever we input images that contain only noise, the neural network returns a 0.
These optimal values of the free parameters are determined by "fitting" the neural network to the training images, described in Sec. IV D 1, whose class ("noise" or "CHE") is known. How well the neural network fits the data will be quantified via a loss function that we will want to minimize. As our loss function we will use the binary cross-entropy between true label values and the neural network predictions. This is the most commonly  N ), number of different kernels (K), stride (S)" and when not specified we are using a ReLu activation function. We have put the convolutional layers of each residual block between brackets and the multiplicative number outside the brackets represents how many times this block is repeated. The first convolutional layer of blocks 4, 8 and 14 will have a stride S = 2 to halve the width and height of the output, while the other convolutional layers in the residual blocks have a stride S = 1.
used loss function for binary classifiers [54] and it is computed in the following way: where N is the total number of images we are evaluating, q i are the true labels (0 for noise and 1 for CHE) and p i is the output of the neural network (we want it to be 0 for noise and 1 for CHE). Note that the closer the predictions p i are to the real values q i , the smaller the binary cross-entropy will be, taking a value of 0 whenever p i = q i . Minimizing this loss function will thus produce predictions closer to the real values. For the optimization of the parameters of the neural network to minimize the loss function on the training data we will use the Adam method for stochastic optimization [55]. Adam is an algorithm for fist-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower order moments. We will use the Python implementation of this algorithm in tensorflow [53], setting a learning rate of α = 0.001 a batch size of 32 and applying 12 training epochs. The result of training the neural network on the 64028 noise training images and 45356 CHE training images is shown in Fig. 25.
On Fig. 25 we have plotted the loss as a function of the training epoch, computed both for the training as well as for the test images used for validation. We can observe how the training loss monotonically decreases, as we would expect from the fact that we are optimizing the network to minimize this quantity. More interesting is to observe the behavior of the loss when testing the network on the validation samples, which are not being looked at to optimize the network. We observe that even though this test loss fluctuates somewhat at small training epochs, at larger epochs it converges towards small values of the loss, around 0.04, close to the training loss. This means that the neural network is being able to correctly generalize from the training data to data it has not previously looked at and that it is not suffering from overfitting.
On Fig. 25 we also represent the evolution of the accuracy of the neural network as a function of the number of training epochs, where the accuracy is the number of images correctly classified (assuming that p ≤ 0.5 corresponds to noise and p > 0.5 corresponds to CHE) divided by the total number of images. The accuracy of the neural network on the training images generally increases with the training epoch, reaching a final value of 98.8%, which means that the optimization of the loss is translating into a very accurate classification of the images. Looking at the accuracy in the test images, we observe that after the 12 epochs it reaches 98.7%, close to the training value. This further validates the fact that the neural network is learning the correct patterns to classify the images during training and it is able to apply them to data it has not previously looked at.
A very useful representation to visualize the performance of a binary classifier is through the "receiver op- erating characteristic" (ROC) curve [56]. This is a parametric curve in which we show the evolution of the rate of true positives 2 and of the rate of false positives 3 when varying between 0 and 1 the minimum probability we require to consider an event to be from a close hyperbolic encounter. The ROC curve of our neural network is shown in Fig. 26. If we look at the upper left corner of Fig. 26, we can see that given the right probability threshold, we can obtain very large values of the true positive rate at very low values of the false positive rate. This is the desired operation point of any classifier and it is indicative of the fact that our neural network is a very good binary classifier. 2 The rate of true positives is the number of test CHE images correctly identified as CHE images divided by total number of CHE images tested. 3 The rate of false positives is the number of test noise images incorrectly identified as CHE images divided by total number of noise images tested.  Positive/Negative means that the event has been classified as CHE/noise. True/False means that the event has been correctly/incorrectly classified. The false positive rate derived from this table will be 0.28%, while the true positive rate will be 96.59%.
For our purposes we will choose a high value of the probability threshold to consider that an event is a CHE, requiring that p > 0.9. This will be done to reduce a lot the number of false positives that we will obtain and thus select only the most promising events. The truth table one obtains when running the neural network in the validation samples with this 0.9 threshold is shown in Table  IV, where we can observe how the Neural Network with the 0.9 threshold greatly suppresses the false positives without having too many false negatives. Since we obtained 135 false positives in the 48125 test noise images, corresponding to 275 days of noise data, the false alarm rate of this neural network trigger will be 0.49 days −1 , that is, if the data contained only noise, we would expect a false trigger every 2 days.
Nonetheless, we do not expect all the signals to be equally easy to detect. Just as in the case of the correlation trigger, we would expect that the close hyperbolic encounters with high signal to noise ratio will be easier to identify. To test this, in Fig. 27 we show the distribution in signal to noise ratio of the CHE images used for testing the neural network as well as the distribution of the subset of these images that is correctly identified by the neural network. Dividing the number of events detected at each signal to noise ratio by the total number of events tested at that signal to noise ratio, we can get the "trigger efficiency" of our neural network. This is shown in Fig. 27, where we see that the efficiency at very low values of the signal to noise ratio is quite small, meaning that we detect only a small fraction of faint CHE events. Nonetheless, it rises quickly with the increasing signal to noise ratio and we are able to detect more than 50% of the events above a signal to noise ratio of 5.
The relatively bad performance at very low signal to noise ratio is expected because of the fact that such faint events leave a very small signal in the detector that is hard to distinguish from the correlated noise fluctuations.

Results on the data
Now that we have validated the neural network, checking that it recovers most of the close hyperbolic encounters (CHE) with signal to noise ratio above 5 while having a false alarm rate of 0.49 days −1 , we will want to run the neural network on the 2704 data events that passed the correlation trigger requirements, and check how many of them are determined by the neural network to have a probability p > 0.9 to be close hyperbolic encounters.
In Fig. 28 we show a histogram of the number of events we obtain at each value of the CHE probability when running the neural network on the data. If we compare the number of observed events with the number of events we would expect if only noise was present on the data, we can observe that at high values of the CHE probability, we consistently observe more events than expected, but almost all excesses are within the one standard deviation region and are thus not statistically significant.
The last bin of Fig. 28 corresponds to the events with p > 0.9 that are determined by the neural network to be the most likely to come from close hyperbolic encounters. From the validation of the neural network on the noise test images we would expect to have 7.5 ± 2.7 events above this 0.9 probability if the 15.3 days of analyzed data contained only noise. When we actually run the neural network on the data we obtain 8 such events, whose properties are detailed in Table V and are shown in Figs. 29-36. This number of observed events is well within the expected range if there was only noise in the data. Nonetheless, this will not mean that individual events within the selected candidates will not be able to come from hyperbolic encounters.
The two most promising events according to the neural network are in Fig. 29 and Fig. 30. However, if we look at them, these events have a chirp-like shape, with the frequency increasing with time. This is because they correspond to GW170814 and GW170809 [50] respectively, which are claimed by LIGO-Virgo collaboration to come from the coalescence of black hole binaries. Nonetheless, since they are both very massive binaries, the last oscillations before merger can look like close hyperbolic encounters, which is tricking the neural network. Additionally, these two signals come from gravitational waves, and therefore have the correct time and intensity relationships between interferometers that the neural network will be looking for. The fact that we recover these events so strongly is a further validation of the neural network. If we wanted to reject events coming from black hole binary coalescence, we could train the neural network with examples of these events labeled as "noise".
The event GW170817 that did pass the correlation trigger (Table I) is not accepted by the neural network because it comes from the coalescence of neutron stars and therefore the merger happens at very large frequencies (well above 800Hz). This means that the last oscillations happen outside the image, which makes this type of event more difficult to mistake with a close hyperbolic encounter.
The rest of the events found by the neural network, shown in Figs. 31-36 are not claimed by the LIGO-Virgo collaboration. These events look much more like the close hyperbolic encounters we are looking for, described in Sec. III B. Nonetheless, the number of events we find is consistent with them coming from random correlated glitches. To decide whether or not these events can come from close hyperbolic encounters, further analysis would be required. This further analysis usually consists on estimating the parameters of the encounter using Bayesian inference [57] and checking that the results obtained for the parameters make physical sense and that the Bayes factor against the noise hypothesis is large.

V. SUMMARY AND CONCLUSIONS
The objective of this paper was to search for black hole hyperbolic encounters in current gravitational wave detectors. These encounters are expected to happen in dense black hole clusters when two black holes gravitationally scatter off each other. If the black holes get sufficiently close to each other during the scattering they will emit bremsstrahlung gravitational waves that can be observed at current Earth-based laser interferometers.
To determine the gravitational radiation we could expect from these close hyperbolic encounters, in Sec. II we developed 1.5 post newtonian accurate templates for the gravitational waves emitted by spinning compact binaries in hyperbolic orbits, taking into account up to leading order spin effects, and we showed how to numerically implement these templates in an efficient way using Python.
In Sec. III we studied how these gravitational waves interact with the network of gravitational wave detectors currently present on Earth, deriving the antenna factors for laser interferometers. Since real detectors are dominated by noise, we discussed how to use data processing techniques to make the signal of the hyperbolic encounters stand out over the noise. Namely we bandpassed the data between 20Hz and 800Hz, whitened it with the noise power spectrum distribution and Q transformed it to show the characteristic dependence in the time-frequency domain.
Having determined the signal we were looking for, in Sec. IV we developed a two level trigger to extract the possible hyperbolic candidates from the 15.3 days of publicly available data in which the three detectors were under nominal operation. The first level of the trigger consisted on a loose selection based on correlations between detectors and tuned to accept possible close hyperbolic encounter event candidates while doing a large reduction of the data. This part of the trigger was tested and validated using injections, which showed that we are able to recover most of the events with signal to noise ratio above 10. The second level of the trigger consisted of a convolutional neural network to classify the 2704 events accepted by the first level trigger into either noise or close hyperbolic encounters. This neural network was trained using images of simulated close hyperbolic encounter signals as well as correlation triggers between unsynchronized data. The neural network was validated using test images, which showed that it is able to recover most of the close hyperbolic events that passed the first trigger and that have a signal to noise ratio above 5, while having a false alarm rate of 0.49 days −1 .
When running the full trigger on the 15.3 days of data we obtained 8 hyperbolic event candidates, consistent with the number of false positives expected if only noise was present on the data, of 7.5 ± 2.7. The two most promising events according to the neural network corre-spond to GW170814 and GW170809 respectively, which come from the coalescence of black hole binaries, whose last oscillation can look like a close hyperbolic encounter, specially if we do not train our network to reject this coalescence type of events. The six remaining candidates did look more like the close hyperbolic encounters we were looking for. Nonetheless, to decide whether or not these events can come from close hyperbolic encounters, further analysis would be required. This further analysis could consist on estimating the parameters of the encounter using Bayesian inference and checking that the results obtained for the parameters make physical sense and that the Bayes factor against the noise hypothesis is large. Now that we have developed and validated a method to find close hyperbolic event candidates using a correlation trigger and a neural network, this method can be used to analyze the gravitational wave data of future observing runs. If there are gravitational waves from close hyperbolic encounters at a rate sufficiently large to explain the small excesses seen in Fig. 28, then analyzing more data we would observe a larger excess of events above the number of expected events, that at some point would be statistically significant. In addition, these future observing runs are also expected to have much better detector sensitivities, which would reduce the background of our search.
To translate the constraints on the rate of close hyperbolic encounters observed into information about cluster dynamics, more work is required in the theoretical understanding of the nature and structure of these clusters, to determine the expected rate and parameters of the hyperbolic encounters that will take place within the clusters.
FIG. 29. Event with neural network CHE probability p = 0.997 and correlation trigger discriminant D = 0.469. This event corresponds to GW170814 [50], which is claimed by LIGO-Virgo to be the coalescence of two black holes of masses 31M and 25M .
FIG. 30. Event with neural network CHE probability p = 0.991 and correlation trigger discriminant D = 0.337. This event corresponds to GW170809 [50], which is claimed by LIGO-Virgo to be the coalescence of two black holes of masses 35M and 24M .
which is computed for the same injections as the variance.
In Fig. 37 we also show the loss as a function of the weight of the correlation between Livingston and Hanford, where we observe that the loss has a minimum of 1.798 for a = 0.622, which has D 0 = 0.296.
Looking at Fig. 37, we can observe that both the variance and the loss have their minima near each other. In particular, making the choice of a = 2 3 that was used in the section Sec. IV C, the two quantities take values very close to their minimum, which is why this choice was made. We then choose D 0 = 0.3 to keep at around 9 the signal to noise ratio at which 50% of events are detected.

Appendix B: Injection Parameters
In Table VI we show the ranges of parameters that we have used for the injections all through this paper.