On the Progenitor of Binary Neutron Star Merger GW170817

On 2017 August 17 the merger of two compact objects with masses consistent with two neutron stars was discovered through gravitational-wave ( GW170817 ) , gamma-ray ( GRB  170817A ) , and optical ( SSS17a / AT 2017gfo ) observations. The optical source was associated with the early-type galaxy NGC 4993 at a distance of just ∼ 40 Mpc, consistent with the gravitational-wave measurement, and the merger was localized to be at a projected distance of ∼ 2 kpc away from the galaxy ’ s center. We use this minimal set of facts and the mass posteriors of the two neutron stars to derive the ﬁ rst constraints on the progenitor of GW170817 at the time of the second supernova ( SN ) . We generate simulated progenitor populations and follow the three-dimensional kinematic evolution from binary neutron star ( BNS ) birth to the merger time, accounting for pre-SN galactic motion, for considerably different input distributions of the progenitor mass, pre-SN semimajor axis, and SN-kick velocity. Though not considerably tight, we ﬁ nd these constraints to be comparable to those for Galactic BNS progenitors. The derived constraints are very strongly in ﬂ uenced by the requirement of keeping the binary bound after the second SN and having the merger occur relatively close to the center of the galaxy. These constraints are insensitive to the galaxy ’ s star formation history, provided the stellar populations are older than 1


Introduction
The era of observational gravitational-wave (GW) astronomy was firmly marked by the detection of the first binary black hole coalescence GW150914 (Abbott et al. 2016) by the Advanced LIGO detectors (Aasi et al. 2015). Discovery of a GW source accompanied by coincident electromagnetic (EM) emission, however, remained elusive until now.
On 2017 August 17 the Advanced LIGO (Aasi et al. 2015) and Advanced Virgo (Acernese et al. 2015) interferometer network recorded a transient GW signal consistent with the coalescence of a binary neutron star (BNS) GW170817 (Abbott et al. 2017b). Independently, a gamma-ray signal, classified as a short gamma-ray burst (sGRB), GRB170817A, coincident in time and sky location with GW170817 was detected by the Fermi-GBM instrument (Abbott et al. 2017a(Abbott et al. , 2017b. The three-detector GW data analysis led to the smallest skylocalization area ever achieved for a GW source: ;31 deg 2 when initially shared with the astronomy LIGO-Virgo partners (LIGO Scientific Collaboration & Virgo Collaboration 2017) and later improved to ;28 deg 2 with a fully coherent data analysis (Abbott et al. 2017b).
Aided by the tight localization constraints of the threedetector network and the proximity of the GW source, multiple independent surveys across the EM spectrum were launched in search of a counterpart beyond the sGRB (Abbott et al. 2017c). Such a counterpart, SSS17a (later IAU-designated AT 2017gfo), was first discovered in the optical less than 11 hours after merger, associated with the galaxy NGC 4993 (Coulter et al. 2017a(Coulter et al. , 2017b, a nearby early-type E/S0 galaxy (Lauberts 1982). Five other teams made independent detections of the same optical transient and host galaxy all within about one hour and reported their results within about five hours of one another (Allam et al. 2017;Arcavi et al. 2017aArcavi et al. , 2017bLipunov 2017b;Tanvir & Levan 2017;Yang et al. 2017;Soares-Santos et al. 2017;Lipunov et al. 2017a). The same source was followed up and consistently localized at other wavelengths (e.g., Corsi et al. 2017;Deller et al. 2017aDeller et al. , 2017bDeller et al. , 2017cGoldstein et al. 2017;Haggard et al. 2017aHaggard et al. , 2017bMooley et al. 2017;Savchenko et al. 2017;Alexander et al. 2017;Haggard et al. 2017c;Goldstein et al. 2017;Savchenko et al. 2017). The source was reported to be offset from the center of the galaxy by a projected distance of about 10″ (e.g., Coulter et al. 2017aCoulter et al. , 2017bHaggard et al. 2017aHaggard et al. , 2017bKasliwal et al. 2017;Yang et al. 2017;Yu et al. 2017). NGC 4993 has a Tully-Fisher distance of ∼40 Mpc (Freedman et al. 2001; NASA/IPAC Extragalactic Database 164 ), which is consistent with the luminosity distance measurement from gravitational waves (40 14 8 -+ Mpc). Using the Tully-Fisher distance, the ∼10″ offset corresponds to a physical offset of ;2.0 kpc. This value is consistent with offset measurements of sGRBs in other galaxies, though below the median value of ∼3-4 kpc (Fong et al. 2010;Fong & Berger 2013;Berger 2014).
BNS systems were first revealed with the discovery of PSR B1913+16, the first binary radio pulsar ever detected (Hulse & Taylor 1975). This immediately triggered new ideas for how such close pairs of neutron stars can form in nature (De Loore et al. 1975;Flannery & van den Heuvel 1975;Massevitch et al. 1976;Clark et al. 1979), based on models for the formation of high-mass X-ray binaries (van den Heuvel & Heise 1972;Tutukov & Yungelson 1973) and Wolf-Rayet X-ray binaries, for which strong orbital shrinkage is needed (van den Heuvel & De Loore 1973). With years of pulsartiming observations PSR B1913+16 provided the first firm evidence that GWs existed (Einstein 1916(Einstein , 1918 and were emitted by close binary compact objects (Taylor & Weisberg 1982). This discovery greatly motivated the efforts to directly detect GWs with laser-interferometric detectors and made BNS coalescence events key targets in GW searches (see Abadie et al. 2010 for an overview).
The formation of close binaries with two neutron stars that will merge within a Hubble time is now understood to require complex evolutionary sequences of massive binaries that involve stable and unstable mass-transfer phases and two core-collapse supernova (SN) explosions through which the binary system survives (for reviews, see, e.g., Kalogera et al. 2007;Postnov & Yungelson 2014;Tauris et al. 2017). In particular, the SN explosions that lead to the formation of neutron stars are expected to develop asymmetries during the collapse, either due to neutrino emission or an anisotropic explosion (e.g., Kusenko and Segrè 1996;Janka et al. 2007;Janka 2013). This anisotropy imparts linear momentum on the stellar remnant, known as an SN kick or natal kick.
Strong evidence for this process comes from observations of Galactic pulsar proper motions, which indicate some neutron stars are moving substantially faster than the inferred speed of their progenitors and must receive a large SN kick of ∼400-500 km s −1 at birth (Lyne & Lorimer 1994;Kaspi et al. 1996;Arzoumanian et al. 2002;Chatterjee et al. 2005;Hobbs et al. 2005;Verbunt et al. 2017). However, comprehensive studies of the known BNS systems in the Milky Way have shown that some neutron stars, particularly those in binary systems, might receive smaller kicks than their isolated counterparts (Podsiadlowski et al. 2004;van den Heuvel 2007).
About a decade after the Hulse-Taylor discovery, mergers of two neutron stars were proposed as a potential source of GRBs (Goodman 1986;Paczynski 1986;Eichler et al. 1989;Narayan et al. 1992), especially those of short duration (Kouveliotou et al. 1993). Since the discovery of host galaxies for short GRBs in 2005 (Berger et al. 2005;Fox et al. 2005;Gehrels et al. 2005;Hjorth et al. 2005;Villasenor et al. 2005), substantial evidence had accumulated in support of this hypothesis. For example, many sGRBs have a significant offset relative to the center of their host galaxy (see, e.g., Troja et al. 2008;Fong et al. 2010;Church et al. 2011;Behroozi et al. 2014): this suggests that the progenitors of these sources have migrated from their birth sites to their eventual explosion sites. Specifically, the offset distribution, together with the locations of sGRBs relative to the stellar light of their hosts, are indicative of systemic kicks (see, e.g., Berger 2014). To date, GW170817 is the strongest observational evidence for an extragalactic BNS system and the first GW signal confidently coincident with an sGRB (Abbott et al. 2017a).
In this study, we focus on constraining the immediate progenitor of GW170817 right before the second SN (SN2) that formed the BNS system. We use (i) SN-kick dynamics and kinematic modeling within the host galaxy from SN2 to merger, and (ii) the GW-measured neutron star masses, the identification of the source host galaxy, and its projected distance offset from the galactic center based on the early optical detections (Section 2). We emphasize that we develop this analysis using the very limited knowledge about the galaxy properties available in the literature prior to the announcement of the GW170817 discovery, as at this time we do not have access to the new analysis of galaxy characteristics and star formation history. We present our main results for constraints on the SN kicks, progenitor masses, pre-SN semimajor axes, and galactic radii of SN2 in Section 3, and we explore the sensitivity of our results to all our input assumptions. We find that the constraints are (i) primarily dictated by the requirement that the progenitor remains bound after SN2 and (ii) insensitive to the star formation history of the host galaxy, provided stellar ages are longer than ;1 Gyr. In Section 4, we use the GW BNS merger rate to estimate a BNS formation efficiency for NGC 4993, comment on the role of NGC 4993ʼs globular cluster content in BNS formation, and conclude our analysis.

Analysis Methodology
To investigate the constraints that can be placed on the progenitor of GW170817, we develop a modeling approach comprised of the following elements: (i) assume a gravitational-potential model for the host galaxy, described by a stellar and dark-matter (DM) density profile; (ii) place binary systems in the galaxy according to the stellar profile, and give them a pre-SN orbit in the galaxy; (iii) sample the pre-SN binary properties (pre-SN semimajor axis, progenitor mass of the second neutron star, location of SN2 within the galaxy) and the SN-kick velocity imparted on the binary following from SN2, using multiple assumptions about the underlying distribution of these parameters; (iv) sample the post-SN masses from GW parameter-estimation posterior samples of GW170817; (v) determine if the binary remains bound after SN2 and calculate the post-SN orbital properties, systemic velocity, and inspiral time, assuming two-body orbital mechanics and an instantaneous SN explosion; (vi) evolve the system forward in time, following the trajectory of the binary through the static galactic potential until it merges; (vii) select the systems with a projected offset at merger consistent with the GW170817 measurements, and label them as GW170817-like; (viii) impose constraints based on the age at which the binary formed (thus, its delay time between SN2 and merger) and the true (unprojected) distance from the galactic center, and investigate how such constraints affect our inference on progenitor properties; (ix) repeat the above steps for different input assumptions of the progenitor properties to assess the robustness of our results.
For each set of input assumptions, we evolve 50 million binaries according to the above procedures, which is sufficient to properly sample the distributions of GW170817-like systems. This section provides the model details that are adopted in our analysis.

Source Properties
The orbital-dynamics and kinematic analyses presented here require both GW and EM information. The post-SN orbital characteristics of a binary, such as the semimajor axis, eccentricity, and systemic velocity, depend on the component masses of the binary, which are measured in the GW inspiral. The projected offset of the binary relative to NGC 4993ʼs center, measured by EM observations, allows us to select GW170817-like systems in the model populations.
The best-measured property of a GW inspiral is a combination of the component masses known as the chirp mass, as it determines the leading-order frequency evolution of a GW signal (Cutler & Flanagan 1994;Blanchet et al. 1995). As the binary orbit shrinks and the orbital period decreases, the GW phase becomes progressively influenced by relativistic effects that are related to the mass ratio. Due to its higher-order contribution, the mass ratio is constrained to a lesser degree than the chirp mass. The measurement of these two parameters are used to extract the component masses of the binary. GW170817 had a measured primary mass of 1.36-1.60 M e and a secondary mass of 1.17-1.36 M e , using low-spin priors isotropic in orientation and with a<0.05, where a is the dimensionless spin parameter (see Abbott et al. 2017b for more details). Such low-spin priors are consistent with measured spins in Galactic BNS systems (Brown et al. 2012). We sample posterior distributions of these component mass measurements for each binary realization and assume that the secondary neutron star is the result of SN2.
The location of the source is measured with optical and X-ray observations to an accuracy of 0 5 (Coulter et al. 2017a(Coulter et al. , 2017bHaggard et al. 2017aHaggard et al. , 2017bKasliwal et al. 2017;Yang et al. 2017;Yu et al. 2017). We combine the information in these references with the range of distances reported in NED and adopt a projected offset distance of ;2.0±0.2 kpc for our analysis.

Galactic Model for NGC 4993
To approximate the galactic potential of NGC 4993, we employ the Hernquist density profile (Hernquist 1990) for the stellar component and the Navarro-Frenk-White (NFW) density profile (Navarro et al. 1996) for the DM halo. We use the stellar profile for sampling the location of binaries within the galaxy, and both the stellar and DM profile for calculating the pre-SN circular galactic velocity and evolving the post-SN binaries in the combined static potential.
The Hernquist profile has a density distribution given by where M å is the total stellar mass and a bulge is a scale length (Hernquist 1990). This profile satisfies de Vaucouleurs R 1 4 law, an empirical law for the luminosity as a function of radius for early-type galaxies (de Vaucouleurs 1948). Solving Poisson's equation for the gravitational potential yields The value for the scale length can be computed numerically in terms of the half-light radius (R eff ) as a R 0.55 bulge eff » (Hernquist 1990).
The NFW profile is one of the most commonly used profiles for representing the density distribution of DM halos: ) (Lim et al. 2017). This stellar mass is derived using K-band luminosity of the galaxy from the 2MASS Redshift Survey (Huchra et al. 2012), and the relationship between stellar mass and K-band luminosity from the EAGLE simulation (Schaye et al. 2015 ) (Lim et al. 2017). In addition to stellar and halo masses, we use measurements of the half-light radius of NGC 4993, R eff , which is used in the Hernquist profile. The measured value of R eff for NGC 4993 is provided in galaxy surveys (e.g., Lauberts & Valentijn 1989) and was recently reported as 2.8 kpc (Yu et al. 2017), indicating that the merger occurred at a projected distance of R 0.71 eff from the NGC 4993 center. With the above information, we construct a simple model for the galactic potential of NGC 4993 to be used in our kinematic modeling.

Orbital Dynamics with SN Kicks
We consider the effects of the SN explosion on the orbital dynamics, assuming it is an instantaneous event which imparts a SN kick to the newly formed neutron star and a mass-loss Figure 1. Enclosed mass density (left axis, blue/green/gray) and circular velocity (right axis, orange) profiles for our model galaxy. Stellar mass follows a Hernquist profile (Equation (1)) and dark matter an NFW profile (Equation (3)); note that here we plot the average enclosed mass density for a sphere of radius r rather than the mass density at radius r. The vertical line marks the projected offset of GW170817, which is a lower limit on the true distance of GW170817 from the center of NGC 4993.
kick (often referred to as a Blaauw kick; Blaauw 1961) on the companion neutron star in the binary. We ignore the effects of the first SN (SN1) on the trajectory and orbital properties of the system. The primary reason for this is that previous studies have shown that post-SN1 systemic velocities are small (50-100 km s −1 ) compared to the galactic-motion velocities (see Figure 6 in Belczynski et al. 2002). This is due to the wide pre-SN orbits and hence low pre-SN binary orbital velocities, which regulate the post-SN systemic velocities and limit them to low values (see limits derived in Kalogera 1996). Also, any eccentricity or high orbital separation imparted by SN1 would likely be mitigated by circularization and inspiral during the common-envelope phase of the companion prior to SN2.
The post-SN orbital properties, assuming the binary has circularized prior to SN2, are derived as in Kalogera (1996): where A pre and A post are the pre-SN and post-SN semimajor axes, e post is the post-SN eccentricity, m 2 is the mass of the neutron born in SN2, m 1 is the mass of the companion neutron star, V rel is the relative velocity between the binary components pre-SN, and V i k are the components of the SN-kick velocity V kick in the frame of the binary, which is centered on the exploding star with the pre-SN objects lying along the x-axis and orbiting in the x-y plane.
The system is initially set on a circular orbit in a random direction about the center of the model galaxy. As we show in Figure 2, pre-SN orbits are essential to include when calculating the trajectory of the binary and constraining kick velocities, as kicks tangential to the galactic orbital velocity cause a slingshot effect, which is much more efficient at propelling the binary to outer regions of the galaxy than a purely radial kick. In addition, the post-SN systemic velocity of the binary depends heavily on the mass-loss kick as well as the SN kick. Therefore, placing true constraints on the SN kick based on the offset of the merger requires knowledge of the magnitude of this mass-loss kick, which is dependent on the progenitor helium-star mass 165 (M He ) and pre-SN semimajor axis as well as the final neutron star mass. In Figure 2, we assume an optimally oriented mass-loss kick that is parallel to the galactic velocity to show the true lower limits on the SN kick as a function of SN2 location, for multiple choices of M He and A pre . By comparing the solid lines, we see that the lower limit of SN kicks is strongly dependent the progenitor properties we assume. Adopting a fiducial value consistent with our constraints (M M 3 He =  and A R 2 pre =  ) we find that 99.95% of BNS systems born within 2 kpc of the galactic center satisfy this lower limit. Furthermore, as all systems above this limit reach the offset of 2 kpc in 10 Myr, which is about two orders of magnitude smaller than the typical delay time, it is necessary to continue the evolution of the binary as it explores the galaxy and possibly crosses the projected offset many times, as discussed in Section 2.5.
Following the computation of the post-SN orbital properties, the effect of the kick is added to the pre-SN systemic velocity. Due to the SN kick and mass loss, the velocity of the exploding star changes by where, again, M He is assumed to leave behind the secondary neutron star component m 2 . Thus, the contribution of the kicks to the post-SN systemic velocity in the center-of-mass frame of the system becomes  . Black points plotted in the background show all sampled systems for various progenitor properties and kick angles as described in Section 2.4; less than 0.05% fall below this limit. The time for systems to reach this offset for various SN-kick velocities is shown by the vertical colored lines. The solid lines to the left and right of the labeled solid line show the tangential SN-kick velocities needed in a more conservative mass-loss scenario, respectively. These cases all represent a lower limit in the true physical distance that systems must travel to reach a projected distance of 2.0 kpc, as the projected distance from the galactic center is always less than the true distance. 165 Just before SN2 the companion to the first neutron star is expected to be the He-rich core of a massive star, stripped of its H-rich envelope because of a prior unstable mass-transfer episode and common-envelope phase. Without such a phase, the binary orbits remain too wide for a BNS system that will merge within a Hubble time to form.
Given the pre-SN properties of the systems involved, the post-SN systemic velocities are comparable to the galactic-motion velocities (see Figure 1).
Before the systemic velocity is added to the pre-SN galactic velocity at a random angle, we check constraints on the post-SN orbital properties to ensure the system remains bound. First, we require that the post-SN orbit passes through the pre-SN positions of the masses (Flannery & van den Heuvel 1975): The mass loss and SN-kick magnitude give upper and lower bounds on the amount of orbital expansion or contraction, imposed as in Kalogera & Lorimer (2000): Finally, the kick velocity is constrained from above by the requirement that the system remains bound, and from below by the minimum kick velocity needed to keep the system intact if more than half the mass of the progenitor is lost in SN2 (Kalogera & Lorimer 2000): He 2

Distributions for Pre-SN Progenitor Properties and SN Kicks
The full 13-dimensional input space from which we sample is where m 1 and m 2 are sampled from the posterior parameters of GW170817 (Abbott et al. 2017b); M He is the progenitor helium-star mass; R gal , gal q , and gal f are the spherical coordinates of SN2 in the galactic frame of reference drawn from the Hernquist stellar profile; gal W indicates the direction of motion of the system about the center of the galaxy just prior to SN2; V kick is the magnitude of the SN-kick velocity imparted on the newly formed neutron star; k q and k f are the angular direction of the kick relative to the plane of the binary; and sys q and sys f are the orientation of the plane of the binary with respect to the galactic coordinates. All angles are sampled isotropically in the sphere. This leaves M He , A pre , and V kick , for which we consider various sampling procedures based on either broad assumptions or observationally motivated distributions. The majority of constraints on SN kicks come from proper motion measurements of pulsars within our galaxy (Gott et al. 1970;Lyne & Lorimer 1994;Kaspi et al. 1996;Arzoumanian et al. 2002;Chatterjee et al. 2005;Hobbs et al. 2005;Verbunt et al. 2017). We adopt the distribution from Hobbs et al. 2005 (Hobbs) as one of our input distributions for V kick : a Maxwellian distribution with a 1D rms σ of 265 km s −1 . However, the mechanisms that impart SN kicks to isolated neutron stars may differ from those imparted to neutron stars that remain bound in BNS systems. There are fewer than 20 known BNS systems in the Milky Way, making inference on SN-kick properties a challenging endeavor. Nonetheless, many studies have been performed to better understand the formation process of these systems, combining the observational data with theoretical modeling (e.g., Willems & Kalogera 2004;Piran & Shaviv 2005;Stairs et al. 2006;Willems et al. 2006;Wong et al. 2010;Osłowski et al. 2011;Beniamini & Piran 2016;Tauris et al. 2017). Comprehensive analyses of observed Galactic BNS systems demonstrate that 3-4 systems  Note. Each parameter is designated as either "Sampled," "Calculated," or "Simulated." Hernquist (Hernquist 1990) is a stellar profile used for elliptical galaxies (Section 2.2). Hobbs (Hobbs et al. 2005 Beniamini & Piran (2016) present a two-population model for this apparent bimodality, differentiating low-kick and highkick Galactic BNSs into two groups based on their observed eccentricity and the rotation period of the pulsar in the system. We use the best-fit parameters from this two-population model (BP16) as another kick prescription from which we sample. Beniamini & Piran (2016) also fit a mass-loss model to their two populations, which is tied to the kick model since systems with lower mass loss are expected to have a smaller shell at the time of SN2 and therefore lower SN kicks. We use this twopopulation model for mass loss as an input distribution for M He , which accompanies the bimodal SN-kick model. Physically, the high-kick model corresponds to SN kicks from a Fe corecollapse SN, whereas the low-kick model is meant to emulate the population of binaries that receive electron-capture SN kicks or SN kicks as an ultra-stripped helium star. For the branching ratio between these two populations, we draw 60% of samples from the low-kick model and 40% from the highkick model, as this is the proportion of Galactic systems that fall into each of these categories (Beniamini & Piran 2016). Finally, we consider an input distribution in SN-kick velocities that is not informed by observations: uniform over the range [0, 2500 km s −1 ] (uniform). Figure 3 shows the input distributions of the three SN-kick models described above.
In addition to the various SN-kick velocity input distributions, we consider multiple different sampling procedures for M He and A pre . For M He , we use a uniform sampling and a power law with an index of −2.35 (Salpeter 1955), ranging from m 2 (i.e., no mass loss) to the nominal black hole limit of 8 M e , along with the two-population maximum-likelihood model for mass loss from Beniamini & Piran (2016). We sample A pre uniform and log uniform from 0.1 R e to 10.0 R e . The ranges for both progenitor masses and semimajor axes are motivated by the studies of Galactic BNS systems (e.g., Wong et al. 2010;Tauris et al. 2017).
We summarize the various parameters in our model and sampling procedures in Table 1. To gauge the impact our input distribution on progenitor constraints, we perform runs in which we alter the input distributions of M He , A pre , and V kick in various ways. We use our least constraining input distribution as our reference: uniform in V kick , uniform in M He , and uniform in A pre . This reference sampling is used for our figures, unless otherwise specified.

Kinematic Modeling
With the above we have all necessary quantities to evolve the binary until merger. We calculate the delay time of the binary as a function of post-SN semimajor axis and eccentricity as in Peters (1964):  where a 0 and e 0 are the initial (post-SN) semimajor axis, and k 0 is determined from the initial semimajor axis and eccentricity of the system. After the binary has evolved for time T delay , we determine the offset of the binary from the center of the galaxy by projecting the system onto the x-y plane in galactic coordinates (i.e., we assume the observer is looking at NGC 4993 down the galactic z-axis). If the binary ends at an offset between 1.8 and 2.2 kpc and merges in less than a Hubble time, it is considered a GW170817-like system.
We initially take a simplistic approach and assume that all binaries with delay times less than a Hubble time are valid GW170817 analogs. We then consider a full range of possible stellar-population ages for NGC 4993, from as old as the age of the universe to as young as the present. Further discussion on the star formation history of NGC 4993 is found in Section 4. We also vary the projected offset of GW170817, as if it were not known, to investigate how constraints on progenitor properties change as systems are discovered further from their host galaxies.

Results
Our main results comprise constraints on pre-and post-SN binary properties and SN-kick velocities, which also determine how long each binary lives between SN2 and its GW-driven inspiral and merger. In Figure 4, we show a variety of galactic orbits that potential GW170817 progenitors follow in their host galaxy, depending on post-SN properties and associated delay times. Delay times much longer than the dynamical timescale of the galaxy (;20 Myr at 2 kpc) typically lead to progenitors exploring most of the galaxy kinematically despite the merger happening relatively close to the galactic center. Shorter delay times typically lead to simple orbits of minimal structure, facilitating nearby BNS birth and merger locations, although not always (see, for example, the bottom middle panel of Figure 4).
The T delay times are effectively coupled to the star formation history of NGC 4993, which prior to GW170817 was not well studied. These values are indicative of how long ago SNe typically occurred, and therefore mark the ages of the most dominant stellar populations in this galaxy. In the analysis of our results, we consider a range of different T delay constraints and assess the sensitivity/robustness of derived constraints on progenitor properties to assumptions about the stellar age of NGC 4993, i.e., T delay of GW170817-like progenitors. Though the projected offset of the optical counterpart to GW170817 was well constrained, we also consider our results' robustness against this location constraint. Last, we explore different assumed distributions for the initial progenitor properties and SN kick, and we assess the robustness of our results against such changes.
The main results are presented in Figure 5, for our fiducial simulation where we assume uniform distributions for all input parameters (see Section 2.4). For the progenitor populations in the top row, we examine probability density functions (PDFs) on GW170817 progenitor properties when we impose the projected distance offset constraint of 2.0±0.2 kpc, and different lower limits on the T delay . It is remarkable that, provided the stellar population in NGC 4993 is older than 1 Gyr, the progenitor constraints are highly robust. We also find this insensitivity to the fine details of stellar ages to be true for our other input distributions and when we constrain T delay to specific ranges rather than imposing lower limits. Only if T delay values shorter than 1 Gyr are allowed (i.e., recent star formation has persisted in the host galaxy) are the constraints on the SN Figure 5. Constraints on progenitor properties, SN-kick velocities, and the location of SN2 for various assumed delay times and projected offsets. All plotted lines are kernel density estimates (KDEs) of the recovered distributions, and distributions are normalized over the full range of sampling for a given parameter; vertical axes labels are omitted for readability. In the top row, we set lower limits to the delay times of systems and identify those that match the projected offset of GW170817. As T delay is coupled to the star formation history of NGC 4993, this has the effect of constraining the simulated stellar population of NGC 4993 to older ages. Sampled distributions are shaded in gray for reference. The middle row shows normalized distributions of binaries that survive SN2 (red) and merge at a projected offset of 2.0±0.2 kpc (green; light green shows the histogram of samples to compare with the KDEs). In the bottom row, we investigate how the projected offset of a hypothetical merger similar to GW170817 affects inference on progenitor properties and SN kicks. In the middle and bottom rows, we assume that GW170817 arose from a stellar population older than 1 Gyr. kick and the pre-SN semimajor axis strongly affected: shorter time delays imply tighter post-SN BNS systems which allow for tighter pre-SN binaries that can remain bound even with higher SN-kick magnitudes. Delay times shorter than 1 Gyr also produce a very sharp peak in the galactic radius of SN2 (R SN ) around the merger distance, as the progenitor population becomes dominated by binaries that are born as BNSs relatively close to their merger site with short T delay . To summarize, for delay times greater than 1 Gyr, the median values and 90% ranges for our reference sampling are: kpc for the birth radius away from the galaxy center. More detailed results examining additional parameters and parameter correlations can be found in Figure 8 in the Appendix.
In addition to SN-kick velocities, we examine constraints on the post-SN systemic velocities (V sys ). We find somewhat tighter constraints on V sys for GW170817-like binaries, peaking at ;250 km s −1 and with 90% of systems below ;400 km s −1 when we constrain the population to T 1 Gyr delay  . Tighter constraints are to be expected as the systemic velocities are limited by the requirement that the post-SN binary remains bound; as a result, the systemic velocities saturate at values of about 1.5-2 times the pre-SN relative orbital velocities (see Kalogera 1996 for the analytical derivation of upper limits). We again find that V sys is robust to age constraints; the PDFs on V sys are practically identical provided the stellar population is 1 Gyr.
In the middle and bottom rows of Figure 5, we examine how significant of a constraint is the knowledge of the merger's offset from the galaxy's center. Results in the middle row demonstrate that the primary origin of our constraints on SN kicks and progenitor properties stems from the requirement that systems remain bound after the explosion. Higher kicks, more massive helium-rich progenitors, and wider pre-SN orbits tend to disrupt a larger fraction of systems. We also find that any offset constraint at all differentiates the R SN distributions between SN survivors and GW170817-like systems the most: remaining bound post-SN is not affected by galactic location and without the offset, of course, progenitors follow the galaxy mass distribution. Imposing an offset constraint limits the birth radius to within a factor of typically ∼2-3 from the offset. The relatively small offset from the galaxy center shifts the SN kicks and helium-star masses to smaller values, effectively reducing the BNS post-SN systemic velocities, while it leaves the constraints on A pre unaffected.
We further explore the robustness of our results on the assumed input distributions for V kick , M He , and A pre , again adopting the merger projected offset constraint of 2.0±0.2 kpc. Specifically in Figure 6, we show our results for the three different SN-kick distributions (see Section 2.4). We choose only two cases of T delay constraints given the robustness of our results demonstrated in Figure 5: (i) no constraint (top row), i.e., star formation has continued in this galaxy up until the present, and (ii) T delay > 1 Gyr (bottom row), i.e., the stellar population in NGC 4993 is older than 1 Gyr. It is evident that the constraints on M He , A pre , and R SN are robust against these different SN-kick assumptions. However, the robustness against different kick assumptions comes with the corollary that the data from this one observation is not extremely informative on the true underlying SN-kick distribution. In general, we see that GW170817 constraints exhibit mild sensitivity to the input SN-kick distributions. The uniform and Hobbs sampling procedures tend to shift to smaller SN-kick magnitudes but by relatively small amounts. The behavior with the BP16 assumption is different, but not surprising: the BP16 input distributions are extremely narrow and prescriptive, strongly dictating the allowed SN kicks and progenitor masses. The constraints for A pre and R SN are slightly stronger than those for V kick and M He ; A pre is strongly influenced by the limits placed on delay times, and R SN by the offset of GW170817 with respect to the galactic center.
Lastly, we have performed additional simulations with varying assumptions about the neutron star mass posteriors (Section 2.1) and the galaxy parameters (Section 2.2). To test the sensitivity of our results to neutron star mass measurements, we sampled the high-spin prior (a 0.89 < ) component mass posteriors, which have a much broader range of 1.36-2.26 M e and 0.86-1.36 M e for the primary and secondary components masses, respectively (Abbott et al. 2017b). We find quantitatively insignificant differences in our progenitor constraints. We also assess the robustness of our results against variations in the measured properties of the galaxy (i.e., stellar and DM halo masses, effective radius) and find insignificant changes for variations up to ∼30%.
We quantitatively test the robustness of our results against all assumption variations by calculating the Kullback-Leibler (KL) divergence (Kullback & Leibler 1951), described in more detail in the Appendix. The KL results, as well as the median and 90% credible intervals on all progenitor parameters and all input distributions are reported in Table 2 in the Appendix, and quantitatively justify our statements on insensitivity and robustness. The median values for the progenitor masses and semimajor axes are mostly consistent with favored values found with forward population synthesis of binary evolution (K. Breivik 2017, private communication).
Prior to SN2, the helium star may have been overflowing its Roche lobe and transferring mass to its neutron star companion. If so, the helium star could have lost significant amounts of mass ( M 1   ) prior to its explosion (Pols & Dewi 2002;Ivanova et al. 2003). To investigate the possibility of Roche-lobe overflow (RLO) at the time of the SN2, we examine whether the properties of GW170817 progenitors satisfy the conditions for RLO, adopting the analytical fit for the helium-star radius from Kalogera & Webbink (1998). Figure 7 plots successful binaries on progenitor A M pre He space, indicating those that would have been in an RLO phase at SN2 in green. We see that a significant fraction (;46%, assuming uniform input distributions as described in Section 2.4) of the GW170817 progenitor systems may have been undergoing RLO at the time of the BNS formation. This is not a major surprise, as it is well established by several independent studies that the double pulsar (and other known BNS systems) was also in an RLO phase at the time of SN2 (

Discussion and Conclusions
In the modeling analysis presented here, we focus on constraining the immediate progenitor of GW170817, from its actual formation at the time of the second SN to the final merger. We use (i) SN-kick dynamics and kinematic modeling within the host galaxy and (ii) the GW-measured neutron star masses, the identification of the source host galaxy, and its projected distance offset from the galactic center based on the early optical discoveries. We make the most minimal/agnostic assumptions possible and avoid full, high-fidelity population synthesis models, which can account for the complex binary evolution before SN2. We explore the robustness of our results for different input assumptions.
In our analysis, we assume that the GW170817 progenitor evolved as an isolated binary in the galaxy's field population. There are no reported results regarding observations of globular clusters (GCs) in NGC 4993, so the number of GCs in the galaxy is not known. Given that NGC 4993 could have a sizable population of GCs, a dynamical formation channel for the coalescing BNSs cannot be ruled out a priori. Typically, the number of observed GCs in a galaxy correlates with the luminosity of the galaxy (Barr et al. 2007;Harris 2016). This observed correlation can be used to estimate the number of GCs in NGC 4993. With an apparent V-band magnitude of 12.4 mag (Bellini et al. 2017) and a distance of 40 Mpc, we find an absolute V-band magnitude of −20.6 mag, which for an E/S0-type galaxy would correspond to 250 150 750 -+ GCs. However, since in general GCs comprise only a small fraction of the total mass of the galaxy(∼0.01%-0.1%; Harris et al. 2015) and estimated observational merger rates for BNSs originating from Figure 7. Presence of RLO in progenitor systems prior to SN2. Only systems that produced successful analogs of GW170817 are plotted. Green indicates that the system was in RLO prior to SN2, and blue indicates that the system was at a large enough pre-SN semimajor axis to not be experiencing RLO prior to SN2.
GCs are low (Grindlay et al. 2006;Ivanova et al. 2008), the isolated formation scenario is preferred. Bae et al. (2014), for example, estimated detection rates for LIGO-Virgo of 0.024 to 0.1 events per year for M M 1.4 1.4   -BNSs coming from GCs (assuming a design-sensitivity BNS range of 200 Mpc for the LIGO detectors), which gives ∼10 −4 -10 −2 events per year given the sensitivity at the time of detection (∼50, 100, and 25 Mpc for Hanford, Livingston, and Virgo, respectively). Such a low rate estimate is in contrast with the rate implied by the current GW discovery, and therefore we consider it unlikely that GW170817 was formed in a GC.
We can use the current knowledge from just this one BNS detection in GWs to extract a first estimate of the BNS formation efficiency: the fraction of massive binaries that become merging BNS systems. Our GW data analysis has yielded a measurement of the BNS rate density of 320-4740 Gpc −3 yr −1 (Abbott et al. 2017b), consistent with the measurements from radio-pulsar observations (e.g., Abadie et al. 2010). Given the volume density of Milky Way-like galaxies in the local universe (i.e., galaxies of comparable mass to the Milky Way) this rate measurement translates to 32-474 Myr −1 per Milky Way equivalent galaxy (Abadie et al. 2010). NGC 4993 is of a transitional galaxy type rather than a spiral galaxy. We therefore use the galaxy's stellar mass (instead of their blue luminosities as traditionally done for starforming galaxies) to scale the BNS rate volume density to a rate for this specific galaxy. For the Milky Way and NGC 4993 the masses are very comparable: both have approximately 60 billion solar masses in stars (Licquia & Newman 2015;Lim et al. 2017), and therefore the BNS rate estimates based on stellar mass are roughly comparable. Assuming a Salpeter-like initial mass function (Salpeter 1955) we find that NGC 4993 has formed 4 10 8´b inaries with stars whose initial masses are greater than 5 M e . For a typical merger delay time (which dominates the lifetime of a BNS system) of 4 Gyr (see Table 2 in the Appendix) we calculate that NGC 4993ʼs efficiency in forming BNS merger systems per number of massive binaries is in the range of ∼(1-50) × 10 −4 .
We model NGC 4993 with a spherically symmetric stellar and dark-matter halo profile. We note that the galaxy type is E/S0, an intermediate morphology between spiral and elliptical galaxies, and it can possibly retain a disk structure component instead of a pure spherical, radial profile (Lambas et al. 1992). Most recently, however, Im et al. (2017) show that NGC 4993 is dominated by its bulge, further supporting our assumption of a spherical gravitational potential. In addition, NGC 4993 has an axis ratio of ;0.9 (Crook et al. 2007), which is consistent with a nearly spherical elliptical galaxy (though it does not exclude a face-on disk). In our analysis, we also adopt circular orbits for the galactic motion of the progenitors prior to SN2, even though there are more complex orbits allowed in realistic potentials expected for galaxies like NGC 4993 (e.g., box orbits). The key effect of including the galactic orbits is simple: the pre-SN progenitor was already in motion with orbital velocities of hundreds of km s −1 , which is comparable to the systemic post-SN velocities of the source. The specific shape of the galactic orbits or of the gravitational potential does not appear to be of particular importance, as our constraints are primarily dictated from the necessity that the system remain bound prior to SN2. This assertion is further supported by the fact that our quantitative constraints for progenitor properties are comparable to those found for BNS systems in the Milky Way where the galactic potential of a spiral galaxy is used instead.
In conclusion, we use a minimal set of observational information to constrain GW170817ʼs immediate progenitor, the SN kick imparted to the second neutron star, and its birth location in NGC 4993 with an appropriate galaxy model and the merger offset both informed by photometry. We obtain relatively robust constraints on the progenitor properties, albeit not always tight, strongly influenced by the requirement of keeping the binary bound after the SN and having the merger occur relatively close to the center of NGC 4993. The GW170817 progenitor constraints derived in this study are in good agreement with the progenitor constraints derived for the Galactic BNS systems as well (e.g., Wong et al. 2010;Tauris et al. 2017).
It is important to note that these constraints are essentially unchanged provided the stellar populations in NGC 4993 are older than 1 Gyr. The current literature on NGC 4993 does not provide quantitative information on the galaxy's star formation history. Recent observations(e.g., Foley et al. 2017) might indicate some star formation activity, but as an E/S0 galaxy, it is unlikely that GW170817 was the result of very recent star formation (DeGraaff et al. 2007). Additionally, observations from Im et al. (2017) conclude that the stellar population in NGC 4993 is older than 3 Gyr. Our results strongly indicate that, for a small projected offset like that of GW170817, knowledge of the precise star formation history of the host galaxy is not vital in further constraining SN kicks and progenitor properties.
As more EM counterparts to BNS mergers are identified, we will add to the current sample of BNS systems from the Milky Way and inferred from extragalactic sGRB offset measurements to advance our constraints on progenitor properties. We note that larger projected offsets from the host-galaxy center may provide stronger constraints on the SN-kick magnitudes. In such cases, information on the age of the host-galaxy stellar population, and therefore on the BNS inspiral time, may become more useful. We provide more detailed PDFs on progenitor properties inferred from GW170817-like systems with different delay time constraints (Figure 8) and summary statistics for output PDFs ( Table 2). In Table 2, we include quantities that measure the degree to which the PDFs change when we constrain the delay times and require that the binaries match the observed offset of GW170817. This PDF comparison is done using the KL divergence (Kullback & Leibler 1951): , where we take Q to be the samples prior to applying a constraint (i.e., delay time or correct offset) and P to be the samples post-application of the constraint. Figure 8. Marginalized and joint PDFs on progenitor system properties V kick , M He , A pre , R SN , and R merger . We restrict GW170817-like systems to various lower limits for T delay . The black points show the full population of binaries that correspond to the measured offset of GW170817 (i.e., they have no constraint on T delay ). As delay times become 1 Gyr, the constraints on GW170817-like samples are significantly tightened, in particular, removing systems with low A pre and thus short inspiral times, systems with extremely high SN-kick velocities, and systems that are born as BNSs and quickly merge right at the offset of GW170817. The diagonal line in the joint R R merger SN -PDF, for example, is an artifact of extremely short inspiral times leading to BNS systems merging at the same location as the second supernova.
Specifically, the KL divergence measures the information gained by updating a prior Q to a posterior P. The values are computed by histogramming the samples from Q to approximate q(x) and using the same bin locations to make a histogram of p(x). The integral for KL divergence then becomes analytic. Note. Reported values are the median and 90% confidence interval. The letters in the first three columns indicate the input sampling used: uniform (U), Hobbs (H), log uniform (L), power law (PL), and BP16 (BP); see Section 2.4 for more details. KL divergence scores (in units of Nats) are reported in the four right-most columns, quantifying information gained by imposing constraints (and by proxy, how much the PDFs change). "Offset" quantifies the amount of information learned by taking all binaries that survive the second supernova and imposing the constraint that they merge at the correct projected offset, using the (V kick , M He ) joint PDF. The remaining three KL values take all post-SN binaries with the correct offset and sampling method indicated by the first three columns of a given row and restrict to those with a T delay range specified by that row. These compare the 2D PDFs indicated by the column headers. The rows with T 0, 14 Gyr delay Î [ ] have no age constraints, and thus their age-restricted KL divergence will always be zero (they are being compared to themselves). We have labeled these as "ref" instead of zero, to make this clear. As a rough rule of thumb, KL values 0.1  correspond to small differences in the distributions, ∼0.4-0.6 to modest differences, and 1.0 to quite large differences. In general, we find that we learn slightly more by imposing the offset constraint if the age constraint is tighter. We also consistently get higher KL values when A pre is included in the analysis, which is due to its strong correlation with T delay . Input distributions are described in detail in Section 2.4.