X-Pipeline: An analysis package for autonomous gravitational-wave burst searches

Autonomous gravitational-wave searches (fully automated analyses of data that run without human intervention or assistance) are desirable for a number of reasons. They are necessary for the rapid identification of gravitational-wave burst candidates, which in turn will allow for follow-up observations by other observatories and the maximum exploitation of their scientific potential. A fully automated analysis would also circumvent the traditional "by hand" setup and tuning of burst searches, which is both laborious and time consuming. We demonstrate a fully automated search with X-Pipeline, a software package for the coherent analysis of data from networks of interferometers for detecting bursts associated with GRBs and other astrophysical triggers. We discuss the methods X-Pipeline uses for automated running, including background estimation, efficiency studies, unbiased optimal tuning of search thresholds, and prediction of upper limits. These are all done automatically via Monte Carlo with multiple independent data samples, and without requiring human intervention. As a demonstration of the power of this approach, we apply X-Pipeline to LIGO data to search for gravitational-wave emission associated with GRB 031108. We find that X-Pipeline is sensitive to signals approximately a factor of 2 weaker in amplitude than those detectable by the cross-correlation technique used in LIGO searches to date. We conclude with the prospects for running X-Pipeline as a fully autonomous, near real-time triggered burst search in the next LSC-Virgo Science Run.


I. INTRODUCTION
Gravitational-wave bursts (GWBs) are one of the most interesting classes of signals being sought by the new generation of gravitational-wave detectors. Possible sources include core-collapse supernovae [1], the merger of binaries containing black holes or neutron stars [2], gamma-ray bursts [3], and other relativistic systems; see [4] for a brief overview. These systems typically involve matter at neutron-star densities and very strong gravitational fields, making GWBs potentially rich sources of information on relativistic astrophysics.
The maximum exploitation of a GWB detection would occur when the system is observed by other "messengers" besides gravitational waves, such as optical emission, gamma rays, or neutrinos [5]. Indeed, the first detection of a GWB might rely on independent confirmation by other observatories, and efforts are underway to develop collaborations between gravitational-wave detectors, electromagnetic telescopes, and neutrino observatories (see for example [6,7]). The rapid and confident identification of candidate GWBs by gravitational-wave detectors will be vital for these efforts.
Unfortunately, the analysis of gravitational-wave data tends to be a slow process, with a typical latency of several years between the collection of the data and the publication of results. For example, searches for gravitational-wave transients in the first year (2005-2006) of the LIGO Science Run 5 / Virgo Science Run 1 (S5-VSR1) have only recently been published [8,9]. One of the fastest such analyses has been the search for a gravitational-wave signal associated with GRB 070201 [10], which was published 9 months after the event.
[a] Electronic address: patrick.sutton@astro.cf.ac.uk
The rapid analysis of gravitational-wave data is not trivial, particularly given the non-stationary nature of the background noise in gravitational-wave detectors and the lack of accurate and comprehensive waveform models for GWB signals. Specifically, we need methods capable of detecting weak signals with a priori unknown waveforms, yet which are simultaneously insensitive to the background noise "glitches" that are common in data from gravitational-wave detectors. Glitch rejection is particularly important since it is the limiting factor in the sensitivity of current burst searches, and a confident detection of a GWB will depend critically on robust background estimation. Detector characterisation [11,12] and search optimization tend to be laborious and time-consuming, as is accounting for other systematic effects such as uncertainties in detector calibration.
These considerations motivate the deployment of data analysis packages that can process data rapidly, yet comprehensively. The ideal scenario is a fully autonomous search, one that runs continuously and without human intervention. This requires an analysis that is self-tuning, adjusting search parameters to changes in the detector network and accounting for variations in the properties of the background noise around the time of candidate events.
arXiv:0908.3665v2 [gr-qc] 7 Apr 2010
We present X-Pipeline [13,14], a software package designed for autonomous searches for unmodelled gravitational-wave bursts. X-Pipeline targets GWBs associated with external astrophysical "triggers" such as gamma-ray bursts (GRBs), and has been used to search for GWBs associated with more than 100 GRBs that were observed during S5-VSR1 [15]. It performs a fully coherent analysis of data from arbitrary networks of gravitational-wave detectors, while being robust against noise-induced glitches. We emphasize the novel features of X-Pipeline, particularly a procedure for automated tuning of the background rejection tests. This allows the analysis of each external trigger to be optimized independently, based on background noise characteristics and detector performance at the time of the trigger, maximizing the search sensitivity and the chances of making a detection. This tuning uses independent data samples for tuning and for estimating the significance of candidate events, for unbiased selection of GWB candidates. (See also [16] for a Bayesian-inspired technique for automated tuning.) X-Pipeline can also account automatically for effects such as uncertainty in the sky position of the astrophysical trigger and detector calibration uncertainties. Furthermore, for the ongoing S6-VSR2 run, we are preparing the next step in the evolution of GWB searches: a fully autonomous search, wherein X-Pipeline is triggered automatically by email reports of GRBs, and wherein data is analysed and candidate GWBs identified without human intervention. Our goal is the complete analysis of each GRB within 24 hours of the receipt of the GRB notice. Such a rapid analysis would be fast enough to allow further follow-up observations to be prompted by the GWB candidate.
We begin in Section II with a brief discussion of the theory of coherent analysis in gravitational-wave burst detection. In Section III we discuss the main steps followed in an X-Pipeline triggered coherent search. In Section IV we demonstrate the sensitivity of X-Pipeline on GRB 031108 using actual LIGO data, and compare to the upper limits set by the cross-correlation technique used in the published LIGO search for gravitational waves associated with the same GRB. In Section V we discuss the status of autonomous running of X-Pipeline during the current S6-VSR2 science run of LIGO and Virgo. We conclude with a few brief comments in Section VI.

II. COHERENT ANALYSIS FOR GRAVITATIONAL-WAVE BURST DETECTION
Most algorithms currently used in gravitational-wave burst detection can be grouped into two broad classes. In incoherent methods [17,18], candidate events typically are constructed from each detector data stream independently, and one looks for events with similar duration and frequency band that occur in all detectors simultaneously. By contrast, coherent methods [14,17,19-33] combine data from multiple detectors before processing, and create a single list of candidate events for the whole network. Coherent methods have some advantages over incoherent methods, such as demonstrated usefulness in rejecting background noise "glitches" [14,28,33], and for reconstructing GWB waveforms [19,32]. A less-recognized advantage of coherent methods is that they are relatively easy to tune. For example, time-frequency coincidence windows for comparing candidate GWBs in different detectors are not necessary. Detectors are naturally weighted by their relative sensitivity, so there is no need to tune the relative thresholds for generating candidate events in each detector. This ease of tuning makes coherent methods particularly useful for rapid searches.
That said, there are also drawbacks to coherent methods, the most significant being computational cost. Coherent combinations are typically a function of the sky position of the GWB source; there are $\gtrsim 10^3$ resolvable directions on the sky for a worldwide detector network [34]. This cost is compounded by the need to estimate the background due to noise, which requires repeated reanalysis of the data using time shifts. Fortunately, in triggered searches the sky position of the source is often known to high accuracy, and the amount of data to be analysed is relatively small (typically hours), so the computational cost of a fully coherent analysis is modest. This allows triggered searches to take advantage of the benefits of coherent methods while avoiding or minimizing most of the drawbacks.
In this section we give a brief review of some of the main principles of coherent network analysis as implemented in X-Pipeline.

A. Formulation
A rigorous treatment of gravitational waves is based on linearized perturbations of the spacetime metric around a fixed background (see for example [35]). In the linearized theory based on flat spacetime, when working in a suitable gauge, the perturbations representing the gravitational waves can be shown to obey the ordinary wave equation. The gravitational waves are transverse, and travel at the speed of light. They have two independent polarizations, commonly referred to as "plus" (+) and "cross" (×). Their physical manifestation is a quadrupolar change in the distance between freely falling test particles (approximated in interferometric gravitational-wave detectors by the mirrors in the interferometer arms). Explicit definitions of the plus and cross polarization states can be found, for example, in [17].
The interferometers currently used to try to detect these waves are based on a laser, beamsplitter, and mirrors at the ends of each arm which serve as test masses. Data from each interferometer record the length difference of the arms and, when calibrated, measure the strain induced by a gravitational wave. The LIGO detectors are kilometer-scale power-recycled Michelson interferometers with orthogonal Fabry-Perot arms [36,37]. There are two LIGO observatories: one located at Hanford, WA and the other at Livingston, LA. The Hanford site houses two interferometers: one with 4 km arms (H1), and the other with 2 km arms (H2). The Livingston observatory has one 4 km interferometer (L1). The Virgo detector (V1) is in Cascina near Pisa, Italy. It is a 3 km long power-recycled Michelson interferometer with orthogonal Fabry-Perot arms [38]. The GEO 600 detector [39], located near Hannover, Germany, is also operational, though with a lower sensitivity than LIGO and Virgo. These instruments are all designed to detect gravitational waves with frequencies ranging from ∼ 30 Hz to several kHz.
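Consider a gravitational wave $h_+(t, \vec{x})$, $h_\times(t, \vec{x})$ from a direction $\hat\Omega$. The output of detector $\alpha \in [1, \ldots, D]$ is a linear combination of this signal and noise $n_\alpha$:
$$ d_\alpha(t + \Delta t_\alpha(\hat\Omega)) = F^+_\alpha(\hat\Omega)\, h_+(t) + F^\times_\alpha(\hat\Omega)\, h_\times(t) + n_\alpha(t + \Delta t_\alpha(\hat\Omega)) . \qquad (2.1) $$
Here $F^+(\hat\Omega)$, $F^\times(\hat\Omega)$ are the antenna response functions describing the sensitivity of the detector to the plus and cross polarizations (note that the choice of polarization basis is arbitrary; we use the $\psi = 0$ choice of Appendix B of [17]). Also, $\Delta t_\alpha(\hat\Omega)$ is the time delay between the position $\vec{r}_\alpha$ of detector $\alpha$ and an arbitrary reference position $\vec{r}_0$:
$$ \Delta t_\alpha(\hat\Omega) = \frac{1}{c} (\vec{r}_0 - \vec{r}_\alpha) \cdot \hat\Omega . \qquad (2.2) $$
For brevity, we suppress explicit mention of the time delay and understand the data streams to be time-shifted by the appropriate amount prior to analysis. We also write $h_{+,\times}(t) \equiv h_{+,\times}(t, \vec{r}_0)$.
Since the detector data is sampled discretely, we use discrete notation henceforth. The discrete Fourier transform $\tilde{x}[k]$ of a time series $x[j]$ is
$$ \tilde{x}[k] = \sum_{j=0}^{N-1} x[j]\, e^{-i 2\pi jk/N} , \qquad x[j] = \frac{1}{N} \sum_{k=0}^{N-1} \tilde{x}[k]\, e^{i 2\pi jk/N} , \qquad (2.3) $$
where $N$ is the number of data points in the time domain. Denoting the sampling rate by $f_s$, we can convert from continuous to discrete notation using
$$ x(t) \to x[j] , \quad \tilde{x}(f) \to f_s^{-1} \tilde{x}[k] , \quad \int dt \to f_s^{-1} \sum_j , \quad \int df \to f_s N^{-1} \sum_k , \quad \delta(t - t') \to f_s\, \delta_{jj'} , \quad \delta(f - f') \to N f_s^{-1} \delta_{kk'} . \qquad (2.4) $$
For example, the one-sided noise power spectral density $S_\alpha[k]$ of the noise $\tilde{n}_\alpha$ is
$$ \langle \tilde{n}^*_\alpha[k]\, \tilde{n}_\beta[k'] \rangle = \frac{N}{2}\, \delta_{\alpha\beta}\, \delta_{kk'}\, S_\alpha[k] , \qquad (2.5) $$
where the angle brackets indicate an average over noise instantiations. It is conceptually convenient to define the noise-spectrum-weighted quantities
$$ \tilde{d}_{w\alpha}[k] = \frac{\tilde{d}_\alpha[k]}{\sqrt{\frac{N}{2} S_\alpha[k]}} , \qquad F^{+,\times}_{w\alpha}(\hat\Omega, k) = \frac{F^{+,\times}_\alpha(\hat\Omega)}{\sqrt{\frac{N}{2} S_\alpha[k]}} , \qquad (2.6) $$
$$ \tilde{n}_{w\alpha}[k] = \frac{\tilde{n}_\alpha[k]}{\sqrt{\frac{N}{2} S_\alpha[k]}} . \qquad (2.7) $$
The normalization of the whitened noise is [51]
$$ \langle \tilde{n}^*_{w\alpha}[k]\, \tilde{n}_{w\beta}[k'] \rangle = \delta_{\alpha\beta}\, \delta_{kk'} . \qquad (2.8) $$
With these definitions, equation (2.1) can be written in matrix form as
$$ \tilde{\mathbf{d}} = \mathbf{F}\, \tilde{\mathbf{h}} + \tilde{\mathbf{n}} , \qquad (2.9) $$
where we have dropped the explicit indices for frequency and sky position. We use the boldface symbols $\tilde{\mathbf{d}}$, $\mathbf{F}$, $\tilde{\mathbf{n}}$ to refer to noise-weighted quantities that are vectors or matrices on the space of detectors (note that $\tilde{\mathbf{h}}$ is not noise-weighted and is not in the space of the detectors):
$$ \tilde{\mathbf{d}} \equiv \begin{bmatrix} \tilde{d}_{w1} \\ \vdots \\ \tilde{d}_{wD} \end{bmatrix} , \qquad \tilde{\mathbf{h}} \equiv \begin{bmatrix} \tilde{h}_+ \\ \tilde{h}_\times \end{bmatrix} , \qquad \tilde{\mathbf{n}} \equiv \begin{bmatrix} \tilde{n}_{w1} \\ \vdots \\ \tilde{n}_{wD} \end{bmatrix} , \qquad (2.10) $$
and
$$ \mathbf{F} \equiv \begin{bmatrix} \mathbf{F}^+ & \mathbf{F}^\times \end{bmatrix} \equiv \begin{bmatrix} F^+_{w1} & F^\times_{w1} \\ \vdots & \vdots \\ F^+_{wD} & F^\times_{wD} \end{bmatrix} . \qquad (2.11) $$
(See Table I for a list of the dimensions of all of the quantities used in this section.) Note that each of these quantities is a function of both frequency and (through the antenna response or implied time shift) sky position. As a consequence, coherent combinations typically have to be re-computed for every frequency bin as well as for every sky position.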
Note also that, because of the noise-spectrum weighting, the whitened noise is isotropically distributed in the space of detectors [equation (2.8)]. Therefore, all information on the sensitivity of the network both as a function of frequency and of sky position is contained in the matrix F defined by equation (2.11).
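As an illustration of the two preparatory steps described above (time-shifting and noise weighting), the following is a minimal Python sketch. It assumes the conventions $\Delta t_\alpha = (\vec{r}_0 - \vec{r}_\alpha)\cdot\hat\Omega / c$, with $\hat\Omega$ a unit vector pointing toward the source, and a whitening factor $\sqrt{(N/2)\,S_\alpha[k]}$. The function names and the example geometry are hypothetical and are not taken from X-Pipeline itself.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s


def time_delay(r_det, r_ref, omega_hat):
    """Delay in seconds, (r_ref - r_det) . omega_hat / c, for a unit vector omega_hat."""
    return sum((r0 - ra) * o for r0, ra, o in zip(r_ref, r_det, omega_hat)) / C_LIGHT


def noise_weight(value, psd_bin, n_samples):
    """Divide a Fourier-domain quantity by sqrt(N/2 * S[k]) (assumed convention)."""
    return value / math.sqrt(0.5 * n_samples * psd_bin)


# Hypothetical geometry: a detector 3000 km from the reference point, toward the source.
dt = time_delay(r_det=(0.0, 0.0, 3.0e6), r_ref=(0.0, 0.0, 0.0), omega_hat=(0.0, 0.0, 1.0))

# Whitening one complex Fourier coefficient with S[k] = 1 and N = 8 samples.
d_w = noise_weight(2.0 + 2.0j, psd_bin=1.0, n_samples=8)
```

In this toy geometry the delay is negative, since the wavefront reaches the detector before it reaches the reference position.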

B. Standard Likelihood
In this section we describe some of the simpler coherent likelihoods: those that can be computed from projections of the data. These are the main ones used for signal detection in X-Pipeline. We begin with the simplest coherent likelihood of all: the standard or maximum likelihood, first derived in [17,20].
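Let $P(\tilde{\mathbf{d}}|\tilde{\mathbf{h}})$ be the probability of obtaining the whitened data $\tilde{\mathbf{d}}$ in one time-frequency pixel in the presence of a known gravitational wave $\tilde{\mathbf{h}}$ from a known direction. Assuming Gaussian noise,
$$ P(\tilde{\mathbf{d}}|\tilde{\mathbf{h}}) = \frac{1}{(2\pi)^{D/2}} \exp\left[ -\frac{1}{2} \left| \tilde{\mathbf{d}} - \mathbf{F}\tilde{\mathbf{h}} \right|^2 \right] . \qquad (2.12) $$
For a set $\{\tilde{\mathbf{d}}\}$ of $N_p$ time-frequency pixels,
$$ P(\{\tilde{\mathbf{d}}\}|\{\tilde{\mathbf{h}}\}) = \frac{1}{(2\pi)^{N_p D/2}} \exp\left[ -\frac{1}{2} \sum_k \left| \tilde{\mathbf{d}}[k] - \mathbf{F}[k]\tilde{\mathbf{h}}[k] \right|^2 \right] , \qquad (2.13) $$
where $k$ indexes the pixels. The likelihood ratio $L$ is defined by the log-ratio of this probability to the corresponding probability under the null hypothesis,
$$ L \equiv \ln \frac{P(\{\tilde{\mathbf{d}}\}|\{\tilde{\mathbf{h}}\})}{P(\{\tilde{\mathbf{d}}\}|\{0\})} = \frac{1}{2} \sum_k \left[ \left| \tilde{\mathbf{d}} \right|^2 - \left| \tilde{\mathbf{d}} - \mathbf{F}\tilde{\mathbf{h}} \right|^2 \right] , \qquad (2.14) $$
where $P(\{\tilde{\mathbf{d}}\}|\{0\})$ is the probability of measuring the data $\{\tilde{\mathbf{d}}\}$ when no GWB is present ($\tilde{\mathbf{h}} = 0$).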
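In practice, the signal waveform $\tilde{\mathbf{h}}$ is not known a priori, so it is not clear how to compute the likelihood ratio (2.14). One approach is to treat the waveform values $\tilde{\mathbf{h}} = (\tilde{h}_+, \tilde{h}_\times)$ in each time-frequency pixel as free parameters to be fit to the data. The best-fit values $\tilde{\mathbf{h}}_{\rm max}$ are those that maximize the likelihood ratio:
$$ 0 = \left. \frac{\partial L}{\partial \tilde{\mathbf{h}}} \right|_{\tilde{\mathbf{h}} = \tilde{\mathbf{h}}_{\rm max}} . \qquad (2.15) $$
Because the likelihood ratio $L$ is quadratic in $\tilde{\mathbf{h}}$, (2.15) gives a linear equation for $\tilde{\mathbf{h}}_{\rm max}$. The solution is
$$ \tilde{\mathbf{h}}_{\rm max} = (\mathbf{F}^\dagger \mathbf{F})^{-1} \mathbf{F}^\dagger \tilde{\mathbf{d}} , \qquad (2.16) $$
where we use $\dagger$ to denote the conjugate transpose. ($\mathbf{F}$ is real, but other quantities such as the data vector $\tilde{\mathbf{d}}$ are complex.) Substituting the solution for $\tilde{\mathbf{h}}_{\rm max}$ in (2.14) gives the standard likelihood,
$$ E_{\rm SL} \equiv 2 L(\tilde{\mathbf{h}}_{\rm max}) = \sum_k \tilde{\mathbf{d}}^\dagger \mathbf{P}^{\rm GW} \tilde{\mathbf{d}} , \qquad (2.17) $$
where we define
$$ \mathbf{P}^{\rm GW} \equiv \mathbf{F} (\mathbf{F}^\dagger \mathbf{F})^{-1} \mathbf{F}^\dagger , \qquad (2.18) $$
and we have used the fact that $\mathbf{P}^{\rm GW}$ is Hermitian. (The factor of 2 in the definition of $E_{\rm SL}$ is purely a matter of taste.)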
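The maximization above can be sketched numerically for a single time-frequency pixel. The NumPy example below assumes the least-squares forms $\tilde{\mathbf{h}}_{\rm max} = (\mathbf{F}^\dagger\mathbf{F})^{-1}\mathbf{F}^\dagger\tilde{\mathbf{d}}$ and $\mathbf{P}^{\rm GW} = \mathbf{F}(\mathbf{F}^\dagger\mathbf{F})^{-1}\mathbf{F}^\dagger$. The function name and the three-detector antenna response values are made up for illustration and do not come from X-Pipeline.

```python
import numpy as np


def standard_likelihood(d, F):
    """Best-fit waveform and standard likelihood for one time-frequency pixel.

    d : complex array of shape (D,), whitened data
    F : real array of shape (D, 2), whitened antenna responses [F+, Fx]
    """
    FtF_inv = np.linalg.inv(F.conj().T @ F)
    h_max = FtF_inv @ F.conj().T @ d      # least-squares waveform estimate
    P_gw = F @ FtF_inv @ F.conj().T       # projector onto span{F+, Fx}
    E_sl = np.vdot(d, P_gw @ d).real      # d^dagger P_GW d
    return h_max, E_sl


# Hypothetical three-detector network and a noise-free signal.
F = np.array([[0.5, 0.2], [0.4, -0.3], [-0.1, 0.6]])
h_true = np.array([1.0 + 2.0j, -0.5j])
d = F @ h_true
h_max, E_sl = standard_likelihood(d, F)
```

For noise-free data the fit returns the injected waveform and $E_{\rm SL}$ equals the total energy $|\tilde{\mathbf{d}}|^2$, since the data lie entirely in the signal subspace.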

C. Projection Operators and the Null Energy
It is easy to show that $\mathbf{P}^{\rm GW}$ is a projection operator that projects the data into the subspace spanned by $\mathbf{F}^+$ and $\mathbf{F}^\times$. We know by equation (2.1) or (2.9)-(2.11) that the contribution to $\tilde{\mathbf{d}}$ by any gravitational wave from a fixed sky position is restricted to this subspace. The standard likelihood is therefore the maximum amount of energy [52] in the whitened data that is consistent with the hypothesis of a gravitational wave from a given sky position.
Contrast this with the total energy in the data, which is simply
$$ E_{\rm tot} = \sum_k \left| \tilde{\mathbf{d}} \right|^2 . \qquad (2.19) $$
The total energy is an incoherent statistic in the sense that it contains only auto-correlation terms and no cross-correlation terms. In the limit of a one-detector network, this is the quantity one computes for each time-frequency pixel in an excess-power search [17]. The projection operator $\mathbf{P}^{\rm null} \equiv (\mathbf{I} - \mathbf{P}^{\rm GW})$, which is orthogonal to $\mathbf{P}^{\rm GW}$, cancels the gravitational-wave signal. This yields the null stream with energy
$$ E_{\rm null} \equiv E_{\rm tot} - E_{\rm SL} = \sum_k \tilde{\mathbf{d}}^\dagger \mathbf{P}^{\rm null} \tilde{\mathbf{d}} . \qquad (2.20) $$
The null energy is the minimum amount of energy in the whitened data that is inconsistent with the hypothesis of a gravitational wave from a given sky position.
One advantage of coherent analysis is that the projection from the full data space with energy $E_{\rm tot}$ to the subspace spanned by $\mathbf{F}^+$, $\mathbf{F}^\times$ with energy $E_{\rm SL}$ removes some fraction of the noise, with energy $E_{\rm null}$, without removing any of the signal component (small errors in calibrations, sky position, or power spectra change $\mathbf{F}$ but this affects the signal energy only at second order). This means that a signal can be detected with higher confidence. An important caveat is that the full benefit is gained only if the sky position is known a priori, such as in gamma-ray burst searches. If the sky position of the source is not known a priori, one typically repeats the calculation of the likelihood for a set of directions spanning the entire sky ($\gtrsim 10^3$ directions). Since $\mathbf{F}^+$, $\mathbf{F}^\times$ vary with the sky position, this means that many different projection operators will be applied to the data. This will incur a false-alarm penalty.
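The split of the total energy into a signal-subspace part and a null part can be checked numerically. The sketch below uses NumPy's pseudo-inverse to build the projector onto the span of $\mathbf{F}^+$, $\mathbf{F}^\times$ (equivalent to $\mathbf{F}(\mathbf{F}^\dagger\mathbf{F})^{-1}\mathbf{F}^\dagger$ when the two columns are linearly independent). The antenna responses and the "glitch" are invented for illustration.

```python
import numpy as np


def projection_energies(d, F):
    """Return (E_tot, E_SL, E_null) for one whitened time-frequency pixel."""
    P_gw = F @ np.linalg.pinv(F)          # projector onto span{F+, Fx}
    P_null = np.eye(len(d)) - P_gw        # projector onto the null space
    E_tot = np.vdot(d, d).real
    E_sl = np.vdot(d, P_gw @ d).real
    E_null = np.vdot(d, P_null @ d).real
    return E_tot, E_sl, E_null


F = np.array([[0.5, 0.2], [0.4, -0.3], [-0.1, 0.6]])   # hypothetical network

# A noise-free gravitational wave lies in the signal subspace: no null energy.
signal = F @ np.array([1.0 + 2.0j, -0.5j])
sig_tot, sig_sl, sig_null = projection_energies(signal, F)

# A transient in a single detector generally leaks into the null stream.
glitch = np.array([3.0 + 0.0j, 0.0, 0.0])
gl_tot, gl_sl, gl_null = projection_energies(glitch, F)
```

In both cases $E_{\rm tot} = E_{\rm SL} + E_{\rm null}$; only the single-detector transient leaves appreciable energy in the null stream.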

D. Dominant Polarization Frame and Other Likelihoods
For a single time-frequency pixel, the data from a set of D detectors is a vector in a D-dimensional complex space. One basis of this space is formed by the set of single-detector strains (the basis in which all equations have been written thus far); however, this is not the most convenient basis for writing detection statistics. The 2-dimensional subspace defined by $\mathbf{F}^+$, $\mathbf{F}^\times$ is a natural starting point for the construction of a better basis. If we examine the properties of this 2-dimensional space, we find there is a direction (a choice of polarization angle) in which the detector network has the maximum antenna response, and an orthogonal direction in which the network has minimum antenna response. Choosing those two directions as basis vectors, and completing them with an orthonormal basis for the null space, yields a very convenient basis in which to construct detection statistics. To further simplify things it is possible to define the +, × polarizations so that $\mathbf{F}^+$ lies along the first basis vector, and $\mathbf{F}^\times$ along the second. This choice of polarization definition is called the dominant polarization frame or DPF [21,22]. Note that while searches for modeled signals such as binary inspirals often select the polarization basis with reference to the source, the DPF polarization basis is tailored to the detector network at each frequency. This makes it a particularly convenient choice when searching for more general gravitational-wave burst signals.
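To see how one constructs the DPF, recall that the antenna response vectors in two frames separated by a polarization angle $\psi$ are related by (see [17])
$$ \mathbf{F}^+(\psi) = \cos 2\psi\, \mathbf{F}^+(0) + \sin 2\psi\, \mathbf{F}^\times(0) , \qquad \mathbf{F}^\times(\psi) = -\sin 2\psi\, \mathbf{F}^+(0) + \cos 2\psi\, \mathbf{F}^\times(0) . $$
It is straightforward to show that for any direction on the sky, one can always choose a polarization frame such that $\mathbf{F}^+(\psi)$ and $\mathbf{F}^\times(\psi)$ are orthogonal and $|\mathbf{F}^+(\psi)| > |\mathbf{F}^\times(\psi)|$. Explicitly, given $\mathbf{F}^+(0)$, $\mathbf{F}^\times(0)$ in the original polarization frame, the rotation angle $\psi_{\rm DP}$ giving the dominant polarization frame is
$$ \psi_{\rm DP}(\hat\Omega, k) = \frac{1}{4}\, {\rm atan2}\left( 2\, \mathbf{F}^+(0) \cdot \mathbf{F}^\times(0) ,\; |\mathbf{F}^+(0)|^2 - |\mathbf{F}^\times(0)|^2 \right) , $$
where atan2(y, x) is the arctangent function with range $(-\pi, \pi]$. Note that $\psi_{\rm DP}$ is a function of both sky position and frequency (through the noise weighting of $\mathbf{F}^+$ and $\mathbf{F}^\times$). We denote the antenna response vectors in the DPF by the lower-case symbols $\mathbf{f}^+$, $\mathbf{f}^\times$. They have the properties
$$ |\mathbf{f}^+|^2 \ge |\mathbf{f}^\times|^2 , \qquad \mathbf{f}^+ \cdot \mathbf{f}^\times = 0 . $$
In the DPF the unit vectors $\mathbf{e}^+ \equiv \mathbf{f}^+/|\mathbf{f}^+|$, $\mathbf{e}^\times \equiv \mathbf{f}^\times/|\mathbf{f}^\times|$ are part of an orthonormal coordinate system; see Figure 1. Indeed, the DPF can be viewed as the natural coordinate system in the space of detector data for understanding the sensitivity of the network. Mathematically, rotating to the DPF is the same as doing a singular value decomposition of the matrix $\mathbf{F}$. The squared singular values are $|\mathbf{f}^+|^2$ and $|\mathbf{f}^\times|^2$; i.e., the squared magnitudes of the antenna response evaluated in the DPF. It should be noted that the DPF does not specify any particular choice of basis for the null space. Convenient choices for the null basis can be motivated by how the null energy is used in the search, but we do not consider this issue here.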
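The rotation to the DPF can be sketched in a few lines of Python. The example assumes the rotation convention $\mathbf{F}^+(\psi) = \cos 2\psi\,\mathbf{F}^+(0) + \sin 2\psi\,\mathbf{F}^\times(0)$, $\mathbf{F}^\times(\psi) = -\sin 2\psi\,\mathbf{F}^+(0) + \cos 2\psi\,\mathbf{F}^\times(0)$ and the angle $\psi_{\rm DP} = \frac{1}{4}\,{\rm atan2}(2\,\mathbf{F}^+\cdot\mathbf{F}^\times,\ |\mathbf{F}^+|^2 - |\mathbf{F}^\times|^2)$. The function name and the antenna response values are hypothetical.

```python
import math


def dominant_polarization_frame(F_plus, F_cross):
    """Rotate whitened antenna response vectors into the dominant polarization frame."""
    dot = sum(p * x for p, x in zip(F_plus, F_cross))
    norm_p = sum(p * p for p in F_plus)
    norm_x = sum(x * x for x in F_cross)
    psi = 0.25 * math.atan2(2.0 * dot, norm_p - norm_x)
    c, s = math.cos(2.0 * psi), math.sin(2.0 * psi)
    f_plus = [c * p + s * x for p, x in zip(F_plus, F_cross)]
    f_cross = [-s * p + c * x for p, x in zip(F_plus, F_cross)]
    return psi, f_plus, f_cross


# Hypothetical whitened responses for a three-detector network at one frequency.
F_plus = [0.3, -0.5, 0.2]
F_cross = [0.6, 0.1, -0.4]
psi_dp, f_plus, f_cross = dominant_polarization_frame(F_plus, F_cross)

overlap = sum(p * x for p, x in zip(f_plus, f_cross))   # ~0 in the DPF
mag_plus = sum(p * p for p in f_plus)                    # |f+|^2
mag_cross = sum(x * x for x in f_cross)                  # |f x|^2
```

Because the transformation is a rotation within the signal subspace, the total sensitivity $|\mathbf{f}^+|^2 + |\mathbf{f}^\times|^2$ is unchanged; only its division between the two polarizations changes.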
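In the DPF, the projection operator $\mathbf{P}^{\rm GW}$ takes on the very simple form
$$ \mathbf{P}^{\rm GW} = \mathbf{e}^+ \mathbf{e}^{+\dagger} + \mathbf{e}^\times \mathbf{e}^{\times\dagger} . \qquad (2.26) $$
The standard likelihood (2.17) becomes
$$ E_{\rm SL} = \sum_k \left[ |\mathbf{e}^+ \cdot \tilde{\mathbf{d}}|^2 + |\mathbf{e}^\times \cdot \tilde{\mathbf{d}}|^2 \right] , \qquad (2.27) $$
where we use the notation $\mathbf{a} \cdot \mathbf{b}$ to denote the familiar dot product between $D \times 1$ dimensional vectors $\mathbf{a}$ and $\mathbf{b}$. The plus energy or hard constraint likelihood [21,22] is the energy in the $h_+$ polarization in the DPF:
$$ E_+ \equiv \sum_k |\mathbf{e}^+ \cdot \tilde{\mathbf{d}}|^2 . \qquad (2.28) $$
The cross energy is defined analogously:
$$ E_\times \equiv \sum_k |\mathbf{e}^\times \cdot \tilde{\mathbf{d}}|^2 . \qquad (2.29) $$
The soft constraint likelihood [21,22] (not a projection likelihood) is
$$ E_{\rm soft} \equiv \sum_k \left[ |\mathbf{e}^+ \cdot \tilde{\mathbf{d}}|^2 + \epsilon\, |\mathbf{e}^\times \cdot \tilde{\mathbf{d}}|^2 \right] , \qquad (2.30) $$
where the weighting factor is defined in the DPF as
$$ \epsilon \equiv \frac{|\mathbf{f}^\times|^2}{|\mathbf{f}^+|^2} . \qquad (2.31) $$
Numerous other likelihood-based coherent statistics have been introduced in the literature, such as the Tikhonov regularized statistic [24], a sky-map variability statistic [29], and modified constraint likelihood statistics [25]. Also, comprehensive Bayesian formulations of the problem of GWB detection and waveform estimation are described in [26,27,32]. While some of these statistics are available in X-Pipeline, we do not consider them here.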
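To make the DPF energies concrete, here is a small pure-Python sketch that accumulates $E_+$, $E_\times$, $E_{\rm SL}$ and $E_{\rm soft}$ over a list of pixels. It assumes the inputs are already in the DPF (so $\mathbf{f}^+\cdot\mathbf{f}^\times = 0$), that $|\mathbf{f}^\times| > 0$, and that the soft-constraint weight is $\epsilon = |\mathbf{f}^\times|^2/|\mathbf{f}^+|^2$ evaluated per pixel. The function name and the numbers are illustrative only.

```python
def dpf_energies(pixels):
    """Sum the DPF energies over pixels given as (d, f_plus, f_cross) tuples.

    d is a list of complex whitened data values (one per detector);
    f_plus and f_cross are real DPF antenna response vectors for that pixel.
    """
    E_plus = E_cross = E_soft = 0.0
    for d, f_plus, f_cross in pixels:
        norm_p = sum(x * x for x in f_plus)
        norm_x = sum(x * x for x in f_cross)
        # |e . d|^2 with e = f / |f|
        a = abs(sum(f * y for f, y in zip(f_plus, d))) ** 2 / norm_p
        b = abs(sum(f * y for f, y in zip(f_cross, d))) ** 2 / norm_x
        epsilon = norm_x / norm_p
        E_plus += a
        E_cross += b
        E_soft += a + epsilon * b
    return {"E_plus": E_plus, "E_cross": E_cross,
            "E_SL": E_plus + E_cross, "E_soft": E_soft}


# One hypothetical pixel for a three-detector network.
energies = dpf_energies([([1 + 1j, 2 + 0j, 5 + 0j], [2.0, 0.0, 0.0], [0.0, 1.0, 0.0])])
```

Here the third detector has no response to either polarization, so its data contribute to none of these energies; the soft constraint down-weights the cross term by $\epsilon = 1/4$.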

E. Statistical Properties
One convenient property of the projection likelihoods $E_+$, $E_\times$, $E_{\rm SL}$, $E_{\rm null}$, $E_{\rm tot}$ is that their statistical properties for signals in Gaussian background noise are very simple. Specifically, for a set of time-frequency pixels and a sky position chosen a priori, each of these energies follows a $\chi^2$ distribution with $2 N_p D_{\rm proj}$ degrees of freedom:
$$ 2E \sim \chi^2_{2 N_p D_{\rm proj}}(\lambda) . \qquad (2.32) $$
Here $N_p$ is the number of pixels (or time-frequency volume). $D_{\rm proj}$ is the number of dimensions of the projection, which is 1 for $E_+$, $E_\times$, 2 for $E_{\rm SL}$, and $D$ for $E_{\rm tot}$. $D_{\rm proj} = D - 2$ for $E_{\rm null}$, except when the null stream is constructed as the difference of the data streams from the two co-aligned LIGO-Hanford detectors, H1 and H2, in which case it is $D - 1$ (the H1-H2 sub-network is only sensitive to a single gravitational-wave polarization, so only one dimension is removed in forming the null stream). The factor of 2 in the degrees of freedom occurs because the data are complex. The non-centrality parameter $\lambda$ is the expected squared signal-to-noise ratio of a matched filter for the waveform restricted to the time-frequency region in question [53] and after projection by the appropriate likelihood projection operator, summed over the network:
$$ \lambda_+ = \rho_+^2 \equiv 2 \sum_k |\mathbf{f}^+|^2\, |\tilde{h}_+|^2 , \qquad (2.33) $$
$$ \lambda_\times = \rho_\times^2 \equiv 2 \sum_k |\mathbf{f}^\times|^2\, |\tilde{h}_\times|^2 , \qquad (2.34) $$
$$ \lambda_{\rm SL} = \lambda_{\rm tot} = \rho_+^2 + \rho_\times^2 = 2 \sum_k |\mathbf{F} \tilde{\mathbf{h}}|^2 , \qquad \lambda_{\rm null} = 0 . \qquad (2.35) $$
Note that in (2.33) and (2.34) the antenna responses and waveforms are defined in the DPF. Eqn. (2.35) is actually independent of the polarization basis used. The mean of the non-central $\chi^2$ distribution (2.32) is $2 N_p D_{\rm proj} + \lambda$, and its standard deviation due to noise alone ($\lambda = 0$) is $2\sqrt{N_p D_{\rm proj}}$. Consequently, one expects a signal to be detectable by a given coherent statistic when
$$ \frac{\lambda}{2 \sqrt{N_p D_{\rm proj}}} \gg 1 . \qquad (2.36) $$
Table II shows the mean and standard deviation of various energy measures when the correct sky position and the time-frequency region are known a priori. For a circularly polarized or unpolarized gravitational wave, $\rho_\times^2/\rho_+^2 \ll 1$ for typical sky positions.
For example, for the LIGO-Virgo network of detectors H1-H2-L1-V1, assuming H2 is half as sensitive as H1, L1, and V1, the median value of $\rho_\times^2/\rho_+^2$ is 0.1, while for the LIGO network H1-H2-L1 the median is 0.02. As a consequence, for many signals $\rho_\times^2$ is negligible. (An exception is linearly polarized GWBs; for these the random polarization angle can make $\rho_\times^2 > \rho_+^2$ in the H1-H2-L1 network for approximately 10% of signals for a typical sky position.) Since all of the energies except $E_\times$ in Table II include $\rho_+^2$, their relative performance is dominated by the level of noise fluctuations. The noise fluctuations in the energies scale as the square root of the number of orthogonal directions used to compute the energy. As a consequence, we expect those statistics that project the data down to fewer dimensions to perform better for GWB detection. For $E_+$ the data is projected onto a single direction. $E_{\rm SL}$ and $E_{\rm soft}$ use data along two directions, and so have higher noise. The total energy $E_{\rm tot}$ uses all of the data and therefore incorporates the largest contributions from noise. In practice, coherent consistency tests (discussed in the next section) can be used to reduce the noise background, allowing statistics like $E_{\rm SL}$ to be used effectively, so that all of the signal-to-noise ratio of a GWB ($\rho_+^2$ and $\rho_\times^2$) can be included in the detection statistic.
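The scaling argument above can be made concrete with a few lines of arithmetic. The sketch below tabulates, for each projection energy, the mean of $2E$ as $2 N_p D_{\rm proj} + \lambda$ and the noise-only standard deviation as $2\sqrt{N_p D_{\rm proj}}$ (the standard deviation of a central $\chi^2$ distribution with $2 N_p D_{\rm proj}$ degrees of freedom). The assignment of $\lambda$ to each energy ($\rho_+^2$, $\rho_\times^2$, their sum, or zero for the null energy) is our reading of the text, and the function name and input numbers are hypothetical.

```python
import math


def energy_moments(n_pixels, n_detectors, rho2_plus, rho2_cross):
    """Mean of 2E and noise-only standard deviation of 2E for each projection energy."""
    rows = {
        "E_plus": (1, rho2_plus),
        "E_cross": (1, rho2_cross),
        "E_SL": (2, rho2_plus + rho2_cross),
        "E_tot": (n_detectors, rho2_plus + rho2_cross),
        "E_null": (n_detectors - 2, 0.0),   # general (non-aligned) network assumed
    }
    return {
        name: (2 * n_pixels * d_proj + lam, 2.0 * math.sqrt(n_pixels * d_proj))
        for name, (d_proj, lam) in rows.items()
    }


# Hypothetical event: 10 pixels, 3 detectors, most of the SNR in the + polarization.
moments = energy_moments(n_pixels=10, n_detectors=3, rho2_plus=50.0, rho2_cross=5.0)

# Signal excess in units of the noise standard deviation.
excess_plus = 50.0 / moments["E_plus"][1]
excess_sl = 55.0 / moments["E_SL"][1]
excess_tot = 55.0 / moments["E_tot"][1]
```

With these numbers the excess is largest for $E_+$ and smallest for $E_{\rm tot}$, in line with the expectation that projecting onto fewer dimensions reduces the noise contribution when $\rho_\times^2$ is small.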

F. Incoherent Energies and Background Rejection
The various likelihood measures E SL , E + , etc. are motivated as detection statistics under the assumption of stationary Gaussian background noise. Real detectors do not have purely Gaussian noise. Rather, real detector noise contains glitches, which are short transients of excess strain that can masquerade as gravitational-wave burst signals. In practice, without a means to distinguish noise glitches from true GW signals, the sensitivity of a burst search will be limited by such glitches. Coherent analyses can be particularly susceptible to such false alarms, since even a glitch in a single detector will produce large values for likelihoods such as E SL . In this section we outline a technique for the effective suppression of such false alarms in coherent analyses.
As shown in Chatterji et al. [14], one can use the auto-correlation component of coherent energies to construct tests that are effective at rejecting glitches. This coherent veto test is based on the null space, the subspace orthogonal to that used to define the standard likelihood. The projection of $\tilde{\mathbf{d}}$ on this subspace contains only noise, and the presence or absence of GWs should not affect this projection in any way. By contrast, glitches do not couple into the data streams with any particular relationship to $\mathbf{F}^+$, $\mathbf{F}^\times$. As a result, glitches will generally be present in the null space projection. This provides a way to distinguish true GWs from glitches, by requiring the null energy to be small for a transient to be considered a GW [28].
To see how an effective test can be constructed, note that we can write equation (2.20) for the null energy as
$$ E_{\rm null} = \sum_k \sum_{\alpha,\beta} \tilde{d}^\dagger_\alpha\, P^{\rm null}_{\alpha\beta}\, \tilde{d}_\beta . $$
As pointed out in Chatterji et al. [14], the null energy is composed of cross-correlation terms $\tilde{d}^\dagger_\alpha \tilde{d}_\beta$ and auto-correlation terms $\tilde{d}^\dagger_\alpha \tilde{d}_\alpha$. If the transient signal is not correlated between detectors (as is expected for glitches), then the cross-correlation terms will be small compared to the auto-correlation terms. As a consequence, for a glitch we expect the null energy to be dominated by the auto-correlation components:
$$ E_{\rm null} \simeq I_{\rm null} \equiv \sum_k \sum_\alpha P^{\rm null}_{\alpha\alpha}\, |\tilde{d}_\alpha|^2 . $$
This auto-correlation part of the null energy is called the incoherent energy. By contrast, for a GW signal, the transient is correlated between the detectors according to equations (2.1) or (2.9)-(2.11). By construction of the null projection operator, these correlations cancel in the null stream, leaving only Gaussian noise. They cannot cancel in $I_{\rm null}$, however, since that is a purely incoherent statistic. Therefore, for a strong GW signal we expect
$$ E_{\rm null} \ll I_{\rm null} . $$
Based on these considerations, the coherent veto test introduced by Chatterji et al. [14] is to keep only transients with
$$ I_{\rm null} / E_{\rm null} > C , $$
where $C$ is some constant greater than 1. This test is particularly effective at eliminating large-amplitude glitches. For smaller amplitude glitches $E_{\rm null}$ can be small compared to $I_{\rm null}$ due to statistical fluctuations; for this reason in X-Pipeline we use a modified test where the effective threshold $C$ varies with the event energy, as discussed in Section III D.
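A minimal NumPy sketch of the fixed-threshold form of this test follows. It assumes $I_{\rm null}$ is the diagonal (auto-correlation) part of the null-energy quadratic form, takes $\mathbf{F}$ to be the same in every pixel for simplicity, and uses an arbitrary threshold $C = 1.5$. It does not implement the energy-dependent threshold that X-Pipeline actually uses, and all names and values are hypothetical.

```python
import numpy as np


def null_energies(d, F):
    """Coherent and incoherent null energies summed over pixels.

    d : complex array of shape (Np, D), whitened data per pixel and detector
    F : real array of shape (D, 2), whitened antenna responses (same for all pixels here)
    """
    P_null = np.eye(F.shape[0]) - F @ np.linalg.pinv(F)
    E_null = np.einsum("ka,ab,kb->", d.conj(), P_null, d).real
    I_null = float(np.sum(np.diag(P_null) * np.abs(d) ** 2))   # diagonal terms only
    return E_null, I_null


def passes_null_veto(d, F, C=1.5):
    """Keep the transient if I_null / E_null > C (written without a division)."""
    E_null, I_null = null_energies(d, F)
    return I_null > C * E_null


F = np.array([[0.5, 0.2], [0.4, -0.3], [-0.1, 0.6]])            # hypothetical network
gw = 10.0 * (F @ np.array([1.0 + 2.0j, -0.5j]))[None, :]        # correlated signal
glitch = np.array([[10.0 + 0.0j, 0.0, 0.0]])                    # single-detector transient

keep_gw = passes_null_veto(gw, F)
keep_glitch = passes_null_veto(glitch, F)
```

For the single-detector transient the cross-correlation terms vanish, so $E_{\rm null} = I_{\rm null}$ and the event is rejected for any $C > 1$; for the correlated signal $E_{\rm null}$ is consistent with zero and the event is kept.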
Analogous tests can be imposed on the other coherent energies, $E_+$, $E_\times$, etc. We define the corresponding incoherent energies by
$$ I_+ \equiv \sum_k \sum_\alpha |e^+_\alpha \tilde{d}_\alpha|^2 , \qquad I_\times \equiv \sum_k \sum_\alpha |e^\times_\alpha \tilde{d}_\alpha|^2 , \qquad I_{\rm SL} \equiv I_+ + I_\times . $$
In each case, we compare the coherent energy $E$ to its incoherent counterpart $I$, making use of the expectation that for a glitch, $E \simeq I$. For a strong GW, the signal summed over both polarizations should build coherently, so one will find
$$ E_{\rm SL} > I_{\rm SL} . $$
By contrast, one may find $E_+ > I_+$ or $E_+ < I_+$ depending on the polarization of the GW signal. Specifically, if the GW signal is predominantly in the + polarization in the DPF, then one will find
$$ E_+ > I_+ , \qquad E_\times < I_\times . $$
If the GW signal is predominantly in the × polarization in the DPF, then one will find the reverse:
$$ E_+ < I_+ , \qquad E_\times > I_\times . $$
[Table II caption] The std(2E) column shows the standard deviation due to noise fluctuations assuming a non-aligned detector network (i.e., $|\mathbf{f}^\times| > 0$). The values for $E_{\rm soft}$ are written as approximate because the weighting factor is itself a function of frequency. All of these values assume the time-frequency region to sum over and the correct sky location are known a priori.
In general, a GW will be characterised by at least one of $E_+ > I_+$ or $E_\times > I_\times$; i.e., at least one of the polarizations will show a coherent buildup of signal-to-noise across detectors. This allows us to impose coherent glitch rejection tests even in the case where a null stream is not available, such as the H1-L1 network of LIGO detectors. Specific examples of coherent consistency tests are discussed in Sections III D and IV. These incoherent energies are not defined as the magnitude of a projection. As a result, they do not obey $\chi^2$ statistics. They do, however, obey a simple relation with the coherent energies:
$$ E_+ + E_\times + E_{\rm null} = I_+ + I_\times + I_{\rm null} . $$
Equivalently, the sum of the cross-correlation contributions to $E_+$, $E_\times$, and $E_{\rm null}$ cancel:
$$ (E_+ - I_+) + (E_\times - I_\times) + (E_{\rm null} - I_{\rm null}) = 0 . \qquad (2.50) $$
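The relation between the coherent and incoherent energies follows in one step from the completeness of the projectors. The short derivation below is a sketch for a single pixel, assuming the DPF forms of the plus and cross projectors and taking each incoherent energy to be the diagonal part of the corresponding quadratic form; the symbols $\mathbf{P}^+$ and $\mathbf{P}^\times$ are introduced here only for convenience.

```latex
% Assumed definitions (single pixel, sums over pixels follow trivially):
%   P^+ = e^+ e^{+\dagger},  P^\times = e^\times e^{\times\dagger},
%   P^{null} = I - P^+ - P^\times,
%   E_X = \sum_{\alpha\beta} \tilde d_\alpha^* P^X_{\alpha\beta} \tilde d_\beta,
%   I_X = \sum_\alpha P^X_{\alpha\alpha} |\tilde d_\alpha|^2 .
\begin{align}
  \mathbf{P}^{+} + \mathbf{P}^{\times} + \mathbf{P}^{\rm null} &= \mathbf{I}
  \;\;\Rightarrow\;\;
  E_+ + E_\times + E_{\rm null}
    = \tilde{\mathbf{d}}^\dagger \mathbf{I}\, \tilde{\mathbf{d}}
    = \sum_\alpha |\tilde d_\alpha|^2 = E_{\rm tot} , \\
  P^{+}_{\alpha\alpha} + P^{\times}_{\alpha\alpha} + P^{\rm null}_{\alpha\alpha} &= 1
  \;\;\Rightarrow\;\;
  I_+ + I_\times + I_{\rm null}
    = \sum_\alpha |\tilde d_\alpha|^2 = E_{\rm tot} .
\end{align}
```

Both sums therefore equal the total energy, so the off-diagonal (cross-correlation) contributions to the three coherent energies must add up to zero.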

III. OVERVIEW OF X-PIPELINE
X-Pipeline is a MATLAB-based software package for performing coherent searches for gravitational-wave bursts in data from arbitrary networks of detectors. In this section we give an overview of the main steps followed in a triggered burst search, describing how the data is processed and how candidate GWBs are identified. In Section V we discuss how an X-Pipeline analysis is triggered.

A. Preliminaries
X-Pipeline performs the coherent analyses described in Section II. The user (a human or automated triggering software) specifies:
1. a set of detectors;
2. one or more intervals of data to be analysed;
In standard usage, X-Pipeline processes the data and produces lists of candidate gravitational-wave signals for each of the specified sky positions. It does this by first constructing time-frequency maps of the various energies in the reconstructed $h_+$, $h_\times$, and null streams. X-Pipeline then identifies clusters of pixels with large values of one of the coherent energies, such as $E_{\rm SL}$ or $E_+$.

B. Time-Frequency Maps
X-Pipeline typically processes data in 256 s blocks. First, it loads the requested data. It constructs a zero-phase linear predictor error filter to whiten the data and estimate the power spectrum [14,40]. For each sky position, X-Pipeline time-shifts the data from each detector according to equations (2.1) and (2.2). The data is divided into overlapping segments and Fourier transformed, producing time-frequency maps for each detector. Given the time-frequency maps for the individual detector data streams $\tilde{\mathbf{d}}$, X-Pipeline coherently sums and squares these maps in each pixel to produce time-frequency maps of the desired coherent energies; see Figure 2. This representation gives easy access to the temporal evolution of the spectral properties of the signal, and all statistics and other quantities that are functions of time and frequency.
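The segmentation and Fourier transform step can be sketched with NumPy as follows. This is an illustration only: the 50% overlap and the Hann window are assumptions made for the example, the whitening and time-shifting steps are omitted, and the function name is hypothetical.

```python
import numpy as np


def time_frequency_map(x, seg_len, overlap=0.5):
    """Complex time-frequency map of a single detector's time series.

    Rows are frequency bins (0 .. seg_len/2), columns are overlapping time segments.
    """
    step = int(seg_len * (1.0 - overlap))
    starts = range(0, len(x) - seg_len + 1, step)
    window = np.hanning(seg_len)               # assumed window choice
    segments = np.array([x[s:s + seg_len] * window for s in starts])
    return np.fft.rfft(segments, axis=1).T


# Toy input: a sinusoid sitting exactly in frequency bin 16 of a 128-sample transform.
n = np.arange(1024)
x = np.sin(2.0 * np.pi * 16 * n / 128)
tf_map = time_frequency_map(x, seg_len=128)
peak_bins = np.argmax(np.abs(tf_map), axis=0)
```

With 1024 samples, 128-sample segments and 50% overlap the map has 65 frequency bins and 15 time bins, and the sinusoid appears in bin 16 of every column. Changing `seg_len` trades time resolution against frequency resolution.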

C. Clustering and Event Identification
Given time-frequency maps of each of the coherent energies, the challenge is then to identify potential gravitational-wave signals in these maps.
The approach used in X-Pipeline is pixel clustering [18]. The user singles out one of the energy measures (typically $E_{\rm SL}$, the summed energy in the reconstructed $h_+$ and $h_\times$ streams) as the detection statistic. A threshold is applied to the detection statistic map so that a fixed percentage (e.g., 1%) of the pixels with the highest value in the current map are marked as black pixels; see Figure 2. Following the method of [18], black pixels that share a common side (nearest neighbors) are grouped together into clusters; see Figure 3 for an example. (As allowed in [18], the user may specify a different connectivity criterion, such as next-nearest neighbors, or apply the "generalized clustering" procedure.) This clustering technique is appropriate for a GWB whose shape in the time-frequency plane is connected, as opposed to consisting of well-separated "blobs". This assumption is valid for many well-modeled signals such as low-mass inspirals and ringdowns.
Each cluster is considered a candidate detection event. Each is assigned a detection statistic value from its constituent pixels by simply summing the values of the statistic in the pixels. This is motivated by the additive property of the log-likelihood ratio -the inherited detection statistic is exactly the detection statistic for the area defined by the cluster. Each cluster is also assigned an approximate statistical significance S based on the χ 2 distribution; see equation (2.32). This significance is used when comparing different clusters to determine which is the "loudest" -the best candidate for being a gravitational wave signal. Finally, the energy at the same time-frequency locations in maps of each of the other requested likelihoods is also computed and recorded for each cluster.
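The thresholding and nearest-neighbor grouping described above can be sketched as follows. This is a minimal illustration, not the X-Pipeline implementation: the function name and the dictionary layout are hypothetical, the black-pixel threshold is taken as a simple quantile of the map, and the significance S of equation (2.32) is not computed.

```python
import numpy as np
from collections import deque

def cluster_black_pixels(stat_map, black_fraction=0.01):
    """Group the loudest pixels into nearest-neighbor clusters.

    stat_map is a 2-D array (frequency x time) of the detection statistic.
    Returns a list of clusters, each with its pixels and summed statistic.
    """
    threshold = np.quantile(stat_map, 1.0 - black_fraction)
    black = stat_map > threshold                 # "black" pixels
    visited = np.zeros(stat_map.shape, dtype=bool)
    n_f, n_t = stat_map.shape
    clusters = []
    for f0, t0 in zip(*np.nonzero(black)):
        if visited[f0, t0]:
            continue
        visited[f0, t0] = True
        queue, pixels = deque([(f0, t0)]), []
        while queue:                             # breadth-first flood fill
            f, t = queue.popleft()
            pixels.append((f, t))
            # pixels sharing a common side: up, down, right, left
            for df, dt in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                g, u = f + df, t + dt
                if 0 <= g < n_f and 0 <= u < n_t and black[g, u] and not visited[g, u]:
                    visited[g, u] = True
                    queue.append((g, u))
        # cluster statistic = sum of the pixel values (additive log-likelihood)
        statistic = float(sum(stat_map[p] for p in pixels))
        clusters.append({"pixels": pixels, "statistic": statistic})
    return clusters
```

Switching to next-nearest-neighbor connectivity would only require adding the four diagonal offsets to the neighbor list.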
The clusters are saved for later post-processing. The analysis of time shifting (see equation (2.2)), FFTing, and cluster identification is then repeated for each of the other sky positions and for each of the requested FFT lengths.
One other important feature of the time-frequency maps is the Fourier transform length or analysis time T , which determines the aspect ratio of the pixels. A longer time gives pixels with poor time resolution but good frequency resolution; a shorter time gives pixels with good time resolution but poor frequency resolution. Depending on the signal duration, different analysis times may be optimal. Since each pixel has the same noise distribution (assuming Gaussian statistics), the optimal pixel size is the size for which the signal spans the smallest number of pixels, so that the statistic is least polluted by noise.
Since the optimal analysis time for the incoming signal is not known, X-Pipeline uses several analysis times, and applies a second layer of clustering between analysis times. For this second layer of clustering, clusters made from black pixels at two different analysis times that overlap in time and frequency are compared. The cluster that has the largest significance is kept as a candidate event; the less significant overlapping clusters are discarded.
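A minimal sketch of this second clustering layer is given below. It assumes, for illustration only, that each cluster is summarised by a bounding box in time and frequency and by its significance; the key names and the greedy ordering are hypothetical choices, not necessarily those made in X-Pipeline.

```python
def merge_across_analysis_times(clusters):
    """Keep the most significant cluster among those that overlap.

    Each cluster is a dict with keys t_min, t_max, f_min, f_max and
    significance (hypothetical layout, bounding boxes assumed).
    """
    def overlaps(a, b):
        # boxes overlap if they intersect in both time and frequency
        return (a["t_min"] < b["t_max"] and b["t_min"] < a["t_max"] and
                a["f_min"] < b["f_max"] and b["f_min"] < a["f_max"])

    kept = []
    # visit clusters from loudest to quietest; discard any that overlap
    # a louder cluster that has already been kept
    for c in sorted(clusters, key=lambda c: c["significance"], reverse=True):
        if not any(overlaps(c, k) for k in kept):
            kept.append(c)
    return kept
```

The input list would contain the clusters from all analysis times for one sky position, so that a short signal found at T = 1/256 s and again at T = 1/8 s is reported only once.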

D. Glitch rejection
As noted in Section II, noise glitches tend to have a strong correlation between each coherent energy E null , E + , E × , and its corresponding incoherent energy I null , I + , I × . X-Pipeline compares the coherent and incoherent energies to veto events that have properties similar to the noise background. These coherent veto tests are applied in post-processing (i.e., after candidate events from the different analysis times are generated and combined).
Two types of coherent veto are available in X-Pipeline. Both are pass/fail tests. The simplest is a threshold on the ratio I/E. Following the discussion in Section II F, a cluster passes the coherent test if
$$ I_{\rm null}/E_{\rm null} \geq r_{\rm null} , \tag{3.1} $$
$$ |\log_{10}(I_+/E_+)| \geq \log_{10}(r_+) , \tag{3.2} $$
$$ |\log_{10}(I_\times/E_\times)| \geq \log_{10}(r_\times) , \tag{3.3} $$
where the thresholds r null , r + , and r × may be specified by the user or chosen automatically by X-Pipeline. The form of equations (3.2) and (3.3) makes these tests two-sided; i.e., they pass clusters that are sufficiently far above or below the diagonal.
The second type of coherent veto test in X-Pipeline is called the median-tracking veto test. In this test, the exclusion curve is nonlinear and designed to approximately follow the measured distribution of background clusters.
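The ratio test of equations (3.1)-(3.3) can be sketched as a single pass/fail function. The function name and the dictionary keys are hypothetical; the sketch assumes all three conditions must hold and that every threshold is positive.

```python
import math

def passes_ratio_veto(cluster, r_null, r_plus, r_cross):
    """Return True if a cluster passes the ratio tests of eqs. (3.1)-(3.3).

    cluster holds the coherent (E_*) and incoherent (I_*) energies.
    """
    # one-sided test on the null energy, eq. (3.1)
    null_ok = cluster["I_null"] / cluster["E_null"] >= r_null
    # two-sided tests on the plus and cross energies, eqs. (3.2) and (3.3)
    plus_ok = abs(math.log10(cluster["I_plus"] / cluster["E_plus"])) >= math.log10(r_plus)
    cross_ok = abs(math.log10(cluster["I_cross"] / cluster["E_cross"])) >= math.log10(r_cross)
    return null_ok and plus_ok and cross_ok
```

Setting r + or r × to 1 makes the corresponding two-sided condition trivially true, since the right-hand side becomes zero.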
Examination of scatter plots of I vs. E for background clusters shows that, while I ≈ E for loud glitches, there is a bias to I > E at low amplitudes. Furthermore, the width of the distribution of background events around the diagonal varies with E. A simple scaling argument shows that for large-amplitude uncorrelated glitches we expect
$$ |I - E| \sim I^{1/2} . \tag{3.4} $$
Specifically, for a large single-detector glitch $\tilde{g}(f)$, the correlation with the noise $\tilde{n}(f)$ in another detector will have mean zero and variance ∝ $|\tilde{g}|^2$ ∝ I. Consequently, we expect noise events to be scattered about the diagonal with a width that is proportional to $I^{1/2}$ (recall that the energies are dimensionless quantities). The median-tracking test uses this information by estimating the median value of I as a function of E for background events.
For each cluster to be tested, it computes the following simple measure n σ of how far the cluster is above or below the median:
$$ n_\sigma \equiv \frac{I - I_{\rm med}(E)}{I^{1/2}} . \tag{3.5} $$
An event is passed if
$$ n_{\sigma,{\rm null}} \geq r_{\rm null} , \tag{3.6} $$
$$ |n_{\sigma,+}| \geq r_+ , \tag{3.7} $$
$$ |n_{\sigma,\times}| \geq r_\times . \tag{3.8} $$
As in the ratio test, the thresholds for each energy type are independent and may be specified by the user or selected automatically by X-Pipeline.
The median function I med (E) is estimated as follows. First, a set of background clusters are binned in log 10 E and the median values of log 10 E and log 10 I in each bin are measured. A quadratic curve of the form
$$ \log_{10} I = a (\log_{10} E)^2 + c \tag{3.9} $$
is fit to these sampled medians. The quadratic is merged smoothly to the diagonal I = E above some value of E. This shape is entirely ad hoc, but in practice it provides a good fit to the observed distribution of glitches.
An example of the median-tracking coherent glitch veto is shown in Figure 4. Each plus symbol (+) denotes a background cluster, colored by its significance log 10 S. The large mass of light points at lower left are weak background noise events. The darkly colored points extending along the diagonal to upper right are strong background noise events. Also shown are clusters due to a series of simulated gravitational-wave signals added to the data, denoted by squares (□). Even though many of these simulated signals are weaker (lighter) than the strong background noise glitches, they are well separated from the background noise population in the two-dimensional (E null , I null ) space. The dashed line shows the coherent veto threshold placed on (E null , I null ); points below this line are discarded. Scatterplots of I + vs. E + and I × vs. E × have similar appearance; see Section IV for examples.
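The median-tracking estimate can be sketched as below, under several stated assumptions: the distance measure is taken to be n σ = (I − I med (E)) / I^{1/2}, following the I^{1/2} width scaling discussed above; the bins are equally spaced in log 10 E; and the smooth merge of the quadratic onto the diagonal I = E at large E is omitted. All function names are hypothetical.

```python
import numpy as np

def fit_median_curve(E_bg, I_bg, n_bins=10):
    """Fit log10 I = a (log10 E)^2 + c to binned medians of background clusters."""
    logE, logI = np.log10(E_bg), np.log10(I_bg)
    edges = np.linspace(logE.min(), logE.max(), n_bins + 1)
    which = np.clip(np.digitize(logE, edges) - 1, 0, n_bins - 1)
    x, y = [], []
    for b in range(n_bins):
        mask = which == b
        if np.any(mask):                     # skip empty bins
            x.append(np.median(logE[mask]))
            y.append(np.median(logI[mask]))
    x, y = np.array(x), np.array(y)
    # least-squares fit with no linear term, matching the form of eq. (3.9)
    A = np.column_stack([x ** 2, np.ones_like(x)])
    (a, c), *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(a), float(c)

def n_sigma(E, I, a, c):
    """Assumed distance of a cluster from the background median, in units of I^(1/2)."""
    I_med = 10.0 ** (a * np.log10(E) ** 2 + c)
    return (I - I_med) / np.sqrt(I)
```

A cluster would then be tested by comparing n_sigma for each energy type with the corresponding threshold, one-sided for the null energy and two-sided for the plus and cross energies.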
In addition to the coherent glitch vetoes, clusters may also be rejected because they overlap data quality vetoes. These are periods when one or more detectors showed evidence of being disturbed by non-gravitational effects that are known to produce noise glitches. Such sources include environmental noise and instabilities in the detector control systems. These data quality vetoes are defined by studies of the data independently of X-Pipeline, and hence are outside of the scope of this report. See [11,12] for recent reviews of data quality and detector characterisation efforts in the LIGO Scientific Collaboration and the Virgo Collaboration.

E. Triggered search: tuning and upper limits
We now focus on the strategy for conducting triggered searches with X-Pipeline, specifically searches for gravitational waves associated with gamma-ray bursts (GRBs). As pointed out by Hayama et al. [29], GRB searches are an excellent case for the application of coherent analysis, since the sky position of the source is known a priori to high accuracy. We can therefore take full advantage of coherent combinations of the data streams without the false-alarm or computational penalties of scanning over thousands of trial sky directions.

Detection Procedure
For the purposes of a search for unmodelled gravitational-wave emission, a GRB source is characterised by its sky position Ω, the time of onset of gamma-ray emission (the trigger time) t 0 , and by the range of possible time delays ∆t between the gamma-ray emission and the associated gravitational-wave emission. The latter quantity is referred to as the on-source window for the GRB; this is the time interval which is analysed for candidate signals. LIGO searches for gravitational-wave bursts associated with GRBs [10,41,42] have traditionally used an asymmetric on-source window of [t 0 − 120 s, t 0 + 60 s], which is conservative enough to encompass most theoretical models of gravitational-wave emission for this source, as well as uncertainties associated with t 0 [3,41].
In order to claim a detection of a gravitational wave, we need to be able to establish with high confidence that a candidate event is statistically inconsistent with the noise background. In X-Pipeline GRB searches, we use the loudest event statistic [43,44] to characterise the outcome of the experiment. The loudest event is the cluster in the on-source interval that has the largest significance (after application of vetoes); let us denote its significance by S on max . We compare S on max to the cumulative distribution C(S max ) of loudest significances measured using background noise (discussed below). We set a threshold on C(S max ) such that the probability of background noise producing a cluster in the on-source interval with significance above this threshold is a specific small value (for example, a 1% chance). The on-source data is then analysed. If the significance C(S on max ) of the loudest cluster is greater than our threshold, we consider the cluster as a possible gravitational wave detection. We can also set an upper limit on the strength of gravitational-wave emission associated with the GRB in question.
In principle, the cumulative distribution C(S max ) of loudest-event significances for clusters produced by Gaussian background noise can be estimated a priori. In practice, however, real detector data is non-Gaussian. The most straightforward procedure for estimating the background distribution is then simply to analyse additional data from times near the GRB, but outside the on-source interval. These data are referred to as off source.
The off-source clusters will not contain a gravitational-wave signal associated with the GRB, and so they can be treated as samples of the noise background. In X-Pipeline, we divide the off-source data into segments of the same length as that used for the on-source data, and analyse each segment in exactly the same manner as the on-source data (using, for example, the same source direction relative to the detectors for computing coherent combinations). For each segment, we determine the significance of the loudest event after applying vetoes. This collection of loudest-event significances from the off-source data then serves as the empirical measurement of C(S max ).
In X-Pipeline we typically set the off-source data to be all data within ±1.5 hours of the GRB time, excluding the on-source interval. This time range is limited enough so that the detectors should be in a similar state of operation as during the GRB on-source interval, but long enough to provide typically ∼50 off-source segments for sampling C(S max ), thereby allowing estimation of probabilities as low as ∼2%. To get still better estimates of the background distribution, we also analyse off-source data after artificially time-shifting the data from one or more detectors by different amounts ranging from a few seconds to several hundred seconds. These shifts can give up to approximately 1000 times the on-source data for background estimation, allowing estimation of probabilities at the sub-1% level.
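The empirical use of the off-source loudest events can be illustrated with a short sketch. The function name is hypothetical, and the probability is taken simply as the fraction of background segments whose loudest event is at least as significant as the on-source one.

```python
import numpy as np

def background_probability(s_on_max, s_off_max):
    """Fraction of off-source segments with a loudest event >= the on-source one.

    s_off_max: loudest surviving significance from each off-source segment
               (time-shifted segments can simply be appended to this list).
    """
    s_off_max = np.asarray(s_off_max, dtype=float)
    return float(np.mean(s_off_max >= s_on_max))
```

With 50 off-source segments the smallest non-zero value this estimate can take is 1/50 = 2%, which is why additional time-shifted segments are needed to reach the sub-1% level.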
Networks containing both the LIGO-Hanford detectors, H1 and H2, present a special case for background estimation, as local environmental disturbances can produce simultaneous background glitches which are not accounted for in time slides. We therefore do not time-shift H1 relative to H2 unless they are the only detectors operating. In that case, the local probability is computed both with and without time slides to allow a consistency check on the background estimation. (Triggered searches with second-scale on-source windows have the advantage of not requiring time shifts at all; see for example [45].) In practice, we do not see significant differences due to correlated environmental disturbances. We attribute this robustness to the coherent glitch rejection tests described in Section III D.

Upper Limits
The comparison of the largest significance measured in the on-source data, S on max , to the cumulative distribution C(S max ) estimated from the off-source data allows us to determine if there is a statistically significant transient associated with the GRB. If no statistically significant signal is present, we set a frequentist upper limit on the strength of gravitational waves associated with the GRB. For a given gravitational-wave signal model, we define the 90% confidence level upper limit on the signal amplitude as the minimum amplitude for which there is a 90% or greater chance that such a signal, if present in the on-source region, would have produced a cluster with significance larger than the largest value S on max actually measured.
We adopt the measure of signal amplitude that is standard for LIGO burst searches, the root-sum-squared amplitude h rss , defined by
$$ h_{\rm rss} = \sqrt{ \int_{-\infty}^{\infty} \left[ h_+^2(t) + h_\times^2(t) \right] dt } . \tag{3.10} $$
The units of h rss are Hz^{−1/2}, the same as for amplitude spectra, making it a convenient quantity for comparing to detector noise curves. For narrow-band signals, the h rss can also be linked to the energy emitted in gravitational waves under the assumption of isotropic radiation via [46]
$$ E_{\rm GW} = \frac{\pi^2 c^3}{G} D^2 f_0^2 h_{\rm rss}^2 , \tag{3.11} $$
where D is the distance to the source and f 0 is the dominant frequency of the radiation. One drawback of h rss is that it does not involve the detector sensitivity (either antenna response or noise spectrum). As a result, upper limits phrased in terms of h rss will depend on the family and frequency of waveforms used, and also on the sky position of the source.
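For concreteness, the sketch below evaluates h rss numerically from sampled polarizations and converts it to an energy. It assumes the standard definition of h rss as the square root of the time integral of h + ² + h × ², and the standard narrow-band isotropic relation E GW = (π² c³ / G) D² f 0 ² h rss ²; the function names and the rounded constants are choices made for this example.

```python
import numpy as np

G_NEWTON = 6.674e-11   # m^3 kg^-1 s^-2 (rounded)
C_LIGHT = 2.998e8      # m / s (rounded)

def hrss(h_plus, h_cross, dt):
    """Root-sum-squared amplitude of sampled polarizations, in Hz^(-1/2)."""
    h_plus, h_cross = np.asarray(h_plus), np.asarray(h_cross)
    # rectangle-rule approximation to the time integral
    return float(np.sqrt(np.sum(h_plus ** 2 + h_cross ** 2) * dt))

def isotropic_energy(h_rss, f0_hz, distance_m):
    """Energy in joules for narrow-band, isotropic emission (assumed relation)."""
    return (np.pi ** 2 * C_LIGHT ** 3 / G_NEWTON) * distance_m ** 2 * f0_hz ** 2 * h_rss ** 2
```

Because the energy scales as h rss ², a factor of 1.7 improvement in amplitude sensitivity corresponds to roughly a factor of 3 lower limit on the emitted energy at fixed distance.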
To set the upper limit, we need to determine how strong a real gravitational-wave signal needs to be in order to appear with a given significance. We do this using a third set of clusters, one which contains sample gravitational-wave signals. Specifically, we repeatedly reanalyse the on-source data after adding ("injecting") simulated gravitational-wave signals to the data from each detector. The data is then analysed as before, producing lists of clusters. The significance associated with a given injection is the largest significance of all clusters that were observed within a short time window (typically 0.1s) of the injection time, after applying vetoes.
The procedure for setting an upper limit is:
1. Select one or more families of waveforms for which the upper limit will be set. For example, a common choice in LIGO is linearly polarized, Gaussian-modulated sinusoids ("sine-Gaussians") with fixed central frequency and quality factor, and random peak time and polarization angle.
2. Find the significance S on max of the loudest event in the on-source data, after applying the coherent glitch veto (Section III D) and any data-quality vetoes.
3. For each waveform family:
(a) Generate random parameter values for a large number of waveforms from the family (e.g., specific peak times and polarization angles for the sine-Gaussian case), and with fixed h rss amplitude.
(b) Add the waveforms one-by-one to the on-source data, and determine the largest significance of any surviving cluster (after vetoes) associated with each injection.
(c) Compute the percentage of the injections that have S ≥ S on max .
(d) Repeat 3a-3c using the same waveform family but with different h rss amplitudes.
The 90% confidence-level upper limit is that h rss for which 90% of the injections have S ≥ S on max .
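Steps 3c and 3d, together with the final selection of the 90% amplitude, can be sketched as follows. The function names are hypothetical, and the linear interpolation in log 10 h rss between the two bracketing amplitudes is an assumption of this example; the text above does not specify how the 90% point is located between tested amplitudes.

```python
import numpy as np

def efficiency(injection_significances, s_on_max):
    """Fraction of injections whose loudest associated cluster has S >= S_on_max."""
    return float(np.mean(np.asarray(injection_significances) >= s_on_max))

def upper_limit(hrss_values, significances_by_amplitude, s_on_max, target=0.9):
    """Smallest h_rss at which the efficiency reaches `target`.

    significances_by_amplitude[i] holds the injection significances
    measured at amplitude hrss_values[i].
    """
    order = np.argsort(hrss_values)
    h = np.asarray(hrss_values, dtype=float)[order]
    eff = np.array([efficiency(significances_by_amplitude[i], s_on_max) for i in order])
    for k in range(len(h)):
        if eff[k] >= target:
            if k == 0:
                return float(h[0])           # already efficient at the weakest amplitude
            # interpolate between the bracketing amplitudes in log10(h_rss)
            x0, x1 = np.log10(h[k - 1]), np.log10(h[k])
            frac = (target - eff[k - 1]) / (eff[k] - eff[k - 1])
            return float(10.0 ** (x0 + frac * (x1 - x0)))
    return float("inf")                      # target efficiency never reached
```

Returning infinity when the target is never reached flags that louder injections are needed before a limit can be quoted.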

Tuning and Closed-Box Analyses
The sensitivity of the pipeline is determined by the relative significance of the clusters produced by real gravitational-wave signals to those produced by background noise. This in turn depends on the details of how the analysis is carried out. In particular, the thresholds used for the coherent glitch rejection tests will have a significant impact on the sensitivity. Too low a threshold will allow background noise glitches to survive, and possibly appear louder than a real gravitational-wave signal. Too high a threshold may reject the gravitational-wave signals we seek.
To improve the sensitivity of X-Pipeline searches, we tune the coherent glitch test to optimize the trade-off between glitch rejection and signal acceptance. We do this using a closed-box analysis. A closed-box analysis estimates the pipeline sensitivity using the off-source and injection data, but not the on-source data. This blind tuning avoids the possibility of biasing the upper limit.
The procedure used for a closed-box analysis follows that used for computing an upper limit, except that an off-source segment is used as a substitute for the true on-source segment. We then test different thresholds for the coherent veto tests, and select the threshold set that gives us the best average "upper limit" estimated from the off-source segments. Specifically, we do the following:
1. For each coherent veto test (E + vs. I + , E × vs. I × , E null vs. I null ) we select a discrete set of trial veto thresholds to test.
2. The off-source segments and the injection clusters are divided randomly into two equal sets: one for tuning, and one for upper-limit estimation.
3. For each distinct combination of trial thresholds (r + , r × , r null ), we do the following:
(a) We apply the coherent veto test (and any data quality vetoes) to the background clusters from each of the tuning off-source segments. The collection of loudest surviving events from each segment gives us C(S max ) for that set of trial thresholds.
(b) We determine the off-source segment that gives the loudest event closest to the 95th percentile of the off-source S max (i.e., closest to C(S max ) = 0.95). This off-source segment is termed the dummy on-source segment. (Different background segments may serve as the dummy on-source for different trial values of the coherent veto thresholds.)
(c) The dummy on-source clusters and the tuning injection clusters are read, and the coherent vetoes and data-quality vetoes are applied to each. The upper limit is computed, treating the dummy clusters as the true on-source clusters.
4. The final, tuned veto thresholds are the ones that give the lowest upper limit based on the dummy on-source clusters. (If testing multiple waveform families, the upper limits may be averaged across families for deciding the optimal tuning.)
5. To get an unbiased estimate of the expected upper limit, we apply the tuned vetoes to the second set of off-source and injection clusters, which were not used for tuning. Steps 3a-3c are repeated using the final thresholds, and using the 50th percentile of S max to choose the dummy on-source segment. The upper limit estimated from the dummy on-source segment in this second data set is the predicted upper limit for the GRB; equivalently, it may be interpreted as the sensitivity of the search.
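The tuning loop of steps 3 and 4 can be sketched as below. The two callables passed to `tune_thresholds` are hypothetical stand-ins for the veto application and upper-limit calculation; the sketch shows only the selection of the dummy on-source segment and of the best threshold set.

```python
import numpy as np

def pick_dummy_on_source(s_max_per_segment, percentile):
    """Index of the off-source segment whose loudest event is nearest the percentile."""
    s = np.asarray(s_max_per_segment, dtype=float)
    target = np.percentile(s, percentile)
    return int(np.argmin(np.abs(s - target)))

def tune_thresholds(trial_thresholds, loudest_after_veto, upper_limit_for):
    """Return (best_thresholds, best_upper_limit) over the trial threshold sets.

    loudest_after_veto(thr) -> loudest surviving significance in each
                               tuning off-source segment (step 3a)
    upper_limit_for(thr, s) -> upper limit from the tuning injections,
                               treating s as the on-source loudest event (step 3c)
    """
    best = None
    for thr in trial_thresholds:
        s_max = loudest_after_veto(thr)
        dummy = pick_dummy_on_source(s_max, 95)       # step 3b
        limit = upper_limit_for(thr, s_max[dummy])
        if best is None or limit < best[1]:           # step 4: lowest limit wins
            best = (thr, limit)
    return best
```

Step 5 would reuse `pick_dummy_on_source` with the 50th percentile on the second half of the off-source segments, with the thresholds held fixed at the tuned values.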
We choose the 95th percentile of S max for tuning to focus on eliminating the tail of high-significance background glitches. This is a deliberate choice, since to be accepted as a detection, a GWB will need to stand well clear of the background. We choose the 50th percentile of S max as the dummy on-source value for sensitivity estimation because this is our best prediction for the typical value of S max in the on-source data under the null hypothesis. Separate data sets are used in tuning and sensitivity estimation to avoid bias from tuning the cuts on the same data used to estimate the sensitivity. The data set used for closed-box sensitivity estimates is later re-used for computing event probabilities and upper limits for the "open-box" (true on-source) data; this introduces no bias because no tuning decisions are made based on the closed-box sensitivity estimate.
In X-Pipeline, the tuning and upper limit calculations are automated. The closed-box analysis is performed first using a pre-selected range of trial thresholds for the coherent glitch test. A web page is generated automatically reporting the details of the closed-box analysis, including the optimized threshold values and the predicted upper limits. For the S5/VSR1 search, the user re-runs the post-processing on the on-source data with the fixed optimized thresholds, and another web page report is generated listing detection candidates and upper limits. For the S6/VSR2 search, we propose to automate this "box opening" as well, so that the on-source events are scanned for candidate GWBs immediately once the closed-box tuning analysis has finished.

Statistical and Systematic Errors
There are several sources of error that can affect our analysis. The principal ones are calibration uncertainties (amplitude and phase response of the detectors, and relative timing errors), and uncertainty in the sky position of the GRB.
X-Pipeline is able to account for these effects automatically in tuning and upper limit estimation. Specifically, X-Pipeline's built-in simulation engine for injecting GWB signals is able to perturb the amplitude, phase, and time delays for each injection in each detector. The perturbations are drawn from Gaussian distributions with mean and variance matching the calibration uncertainties for each detector. Furthermore, the GRB sky position can be perturbed in a random direction by a Gaussian-distributed angle with standard deviation set to the GRB error box width reported by the GCN. Tuning and upper limits based on the perturbed injections are effectively marginalized over these sources of error.
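A minimal sketch of such perturbations is given below. It is not X-Pipeline's simulation engine: the function names are hypothetical, the calibration perturbations are assumed to have zero mean, and the sky offset is drawn as the absolute value of a Gaussian angle applied in a uniformly random direction on the sphere.

```python
import numpy as np

def perturb_calibration(rng, amp_sigma, phase_sigma_rad, time_sigma_s):
    """Draw one amplitude, phase and timing perturbation for one detector."""
    return {
        "amplitude_scale": 1.0 + rng.normal(0.0, amp_sigma),
        "phase_shift_rad": rng.normal(0.0, phase_sigma_rad),
        "time_shift_s": rng.normal(0.0, time_sigma_s),
    }

def perturb_sky_position(rng, ra_deg, dec_deg, sigma_deg):
    """Offset a sky position by a Gaussian-distributed angle in a random direction."""
    theta = np.radians(abs(rng.normal(0.0, sigma_deg)))   # size of the offset
    phi = rng.uniform(0.0, 2.0 * np.pi)                   # direction of the offset
    ra, dec = np.radians(ra_deg), np.radians(dec_deg)
    # unit vector to the source and the local east/north tangent vectors
    n = np.array([np.cos(dec) * np.cos(ra), np.cos(dec) * np.sin(ra), np.sin(dec)])
    east = np.array([-np.sin(ra), np.cos(ra), 0.0])
    north = np.array([-np.sin(dec) * np.cos(ra), -np.sin(dec) * np.sin(ra), np.cos(dec)])
    m = np.cos(theta) * n + np.sin(theta) * (np.cos(phi) * north + np.sin(phi) * east)
    new_ra = np.degrees(np.arctan2(m[1], m[0])) % 360.0
    new_dec = np.degrees(np.arcsin(np.clip(m[2], -1.0, 1.0)))
    return float(new_ra), float(new_dec)
```

Drawing a fresh perturbation for every injection is what makes the resulting efficiency curve, and hence the tuning and upper limit, an average over the calibration and sky-position uncertainties.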
For the S5-VSR1 GRB search, the capability for perturbed injections was not available at the time of the original data analysis, and so the impact of the errors was estimated by re-analysis of a small subset of the full GRB sample. For the S6-VSR2 search, we include calibration and sky-position uncertainties in simulations for all GRBs from the beginning, removing the need to do any additional error analysis.

IV. GRB 031108
GRB 031108 occurred during the third science run of the LIGO Scientific Collaboration ("S3"). At that time, the two LIGO-Hanford detectors H1 and H2 were operating, while the Livingston detector L1 was not. A search for gravitational waves associated with the GRB was performed using a cross-correlation algorithm, and reported in Abbott et al. [42].
To demonstrate X-Pipeline, we perform a closed-box analysis [54] of the LIGO H1-H2 data to search for gravitational waves associated with GRB 031108. We tune the search and estimate its sensitivity to gravitational-wave emission as discussed in Section III E, using the same simulated waveforms as in Abbott et al. [42]. We compare the sensitivity results to those of the cross-correlation search in Abbott et al. [42]. We estimate the 90% confidence upper limits from X-Pipeline to be typically 40% lower than those from the cross-correlation search.

A. Analysis
At the time of GRB 031108, the two LIGO-Hanford detectors H1 and H2 were operating. Figure 5 shows the noise level in the detectors at that time. Since the H1 and H2 detectors have identical antenna responses, the network is sensitive to only one of the two gravitational-wave polarizations from any given sky direction. In the DPF, this means that f × = 0. As a consequence, the cross energy also vanishes identically, E × = 0, and E SL = E + . Each event cluster is therefore characterised by the two coherent energies E + and E null , and their associated incoherent components I + and I null . Figure 6 shows the weighting factors e + as a function of frequency.
X-Pipeline was run on all data within ±1 hr of the GRB time for background estimation. Clusters were generated using Fourier transform lengths of 1/8 s, 1/16 s, 1/32 s, 1/64 s, 1/128 s, and 1/256 s. Figure 7 shows scatter plots of I + vs. E + and I null vs. E null for the half of the off-source clusters that were used for upper limit estimation (i.e., after tuning). Also shown are the clusters produced by simulated sine-Gaussian GWBs at 150 Hz, one of the types tested in [42]. These injections had amplitudes of 6.3 × 10^{−21} Hz^{−1/2}, approximately equal to the h rss upper limit estimated from the closed-box analysis.
As expected, loud background triggers fall close to the diagonal in both of these plots. The simulated gravitational waves also fall close to the diagonal for I + vs. E + ; this is due to the fact that H2 is significantly less sensitive than H1 and so receives very little weighting in the calculation of E + . In turn, this means that the H1-H2 cross terms in E + are small compared to the H1-H1 term, so that E + is dominated by the diagonal components and so is very similar to I + . For the null stream, however, the weightings are reversed, and H2 is weighted higher than H1. As a consequence, gravitational waves lie above the diagonal in the I null vs. E null plot, and it is possible to separate the injections from the background clusters in (E null , I null ) space. X-Pipeline's automated tuning procedure recognizes both of these facts; when run using the median-tracking veto test, it estimates that the best sensitivity will come from requiring a threshold of r null = 5 on (E null , I null ), and imposing no condition on I + vs. E + . The (E null , I null ) threshold is indicated in Figure 7 by the dashed line; points below this line are discarded. As can be seen, this test rejects the majority of the loud off-source clusters, while accepting most of the simulated gravitational-wave clusters. The off-source clusters that survive the test tend to be of low significance, and therefore will not affect the loudest-event upper limit. Figure 8 shows the distribution of S max before and after the null-stream test.
The closed-box analysis discussed in Section III E was used to tune the coherent veto test and estimate the expected upper limit from X-Pipeline. Figure 9 shows a scatter plot of the "dummy" on-source clusters. Recall that the dummy on-source region is selected as the background segment that gives the median loudest event surviving the coherent veto test. It therefore represents the expected typical result under the null hypothesis, averaging over noise instantiations, and so is a more robust way to estimate the pipeline sensitivity than, e.g., picking a random segment (or even the on-source segment).

FIG. 7. Cluster likelihoods: I + vs. E + (top) and I null vs. E null (bottom). The color denotes log 10 (S). Loud background triggers fall close to the diagonal. Simulated gravitational waves also fall close to the diagonal for I + vs. E + , but above the diagonal for I null vs. E null . The dashed line denotes the coherent consistency threshold on (E null , I null ) that is selected by X-Pipeline's automated tuning procedure; points below this line are discarded. This test rejects the majority of the loud off-source clusters, while accepting most of the simulated gravitational-wave clusters, even if the GWB significance is typical of background events. The simulated signals in this plot have h rss = 6.3 × 10^{−21} Hz^{−1/2}, approximately equal to the upper limit estimated from the closed-box analysis.
The predicted h rss upper limits at 90% confidence for narrow-band sine-Gaussian waveforms of different central frequencies are shown in Table III and Figure 10. Table III also shows the actual upper limits from the cross-correlation search reported in [42]. The predicted X-Pipeline sensitivity is approximately a factor of 1.7 better than that of the cross-correlation pipeline, corresponding to an increase in search volume of a factor of 1.7³ ≈ 5. Similar improvements were seen in the open-box analysis of GRBs in the S5-VSR1 run (2005-2007) [15].
As can be seen in Figure 10, the limiting amplitudes for this GRB track the noise spectrum of H2, and correspond to a matched-filter signal-to-noise ratio of approximately 5 in H2. This occurs because the sensitivity of the analysis is limited by the coherent glitch rejection test. This test requires a measurable correlation between the detectors, which in turn requires that the GWB have some minimal signal-to-noise ratio in each. This behaviour is typical of tuning using the 95th percentile of S max , which is an aggressive choice designed to suppress the loud background. While the upper limits tend to be limited by such strong background rejection, our ability to detect a GWB is enhanced, since a GWB candidate will undoubtedly need a significance higher than some very high percentile of the background to be claimed as an actual gravitational wave.

Table III: The units are 10^{−21} Hz^{−1/2}. The simulated waveforms are circularly polarized sine-Gaussians as described in [42].
The factor of 1.7 sensitivity improvement of X-Pipeline relative to the cross-correlation search in [42] can be attributed in part to two factors. We estimate that a factor of approximately 1.3 comes from using E SL rather than the cross-correlation as the detection statistic. E SL includes the auto-correlation terms ($\tilde{d}_{H1}\tilde{d}^*_{H1}$, $\tilde{d}_{H2}\tilde{d}^*_{H2}$) in addition to the cross-correlation terms ($\tilde{d}_{H1}\tilde{d}^*_{H2}$) when combining the H1 and H2 data streams. This gives a net increase in the signal-to-noise ratio. More precisely, one can compute the ratio of the expected contribution to E SL due to a GWB to the standard deviation in E SL due to Gaussian noise; see Section II E. Performing the same calculation for the cross-correlation statistic, one finds the per-pixel ratio for E SL to be 1.8 ≈ 1.3² times larger than that for the cross-correlation (assuming a 2:1 ratio in the noise amplitudes for H2:H1). Another factor of ∼1.2 can be attributed to the clustering, which restricts the likelihood calculation to pixels that show significant signal power (thus tending to exclude pixels that contain only background noise). The cross-correlation statistic in [42] was computed on a minimum time-frequency volume (number of pixels) of approximately 50. By contrast, the typical cluster size in X-Pipeline was found to be 10-30 for injections at the 90% upper limit amplitude. As seen in Section II E and [17], the amplitude sensitivity in Gaussian noise scales as N^{−1/4}. The factor of ∼2 smaller number of pixels used by X-Pipeline should therefore give a factor of ∼2^{1/4} ≈ 1.2 sensitivity improvement. Combining this with the previous factor of 1.3 gives a total improvement of about 1.6.
While this is very close to the average measured improvement, one should keep in mind that these rough estimates have not properly accounted for the non-Gaussianity of the background (which will decrease the sensitivity of both pipelines), or for the tendency of the coherent glitch rejection test to limit the X-Pipeline sensitivity in the absence of strong background glitches. These other effects are presumably also important.
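The arithmetic behind these rough estimates can be checked in a few lines. The variable names are illustrative; the input numbers (1.8, a factor of 2 in pixel count, and 1.7) are those quoted in the text.

```python
# Amplitude gain from the detection statistic: the per-pixel ratio is 1.8
# in energy, so the amplitude gain is its square root.
gain_statistic = 1.8 ** 0.5            # about 1.34

# Amplitude sensitivity scales as N^(-1/4); halving the number of pixels
# therefore gains a factor of 2^(1/4).
gain_clustering = 2 ** 0.25            # about 1.19

# Combined amplitude gain from the two effects.
total_gain = gain_statistic * gain_clustering   # about 1.6

# Search volume scales as the cube of the amplitude sensitivity.
volume_gain = 1.7 ** 3                 # about 4.9
```

The unrounded product is slightly below 1.6, while the rounded factors 1.3 and 1.2 give 1.56; both are consistent with the quoted total of about 1.6 and somewhat below the measured factor of 1.7.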
V. AUTONOMOUS RUNNING
X-Pipeline has been used to process data from S5-VSR1 (2005-2007). This is an "offline" search, being completed almost two years after the last of the GRBs in question was observed. In parallel, X-Pipeline is being improved for the S6-VSR2 run, which started in July 2009. Our goal for S6-VSR2 is fully autonomous running, with a complete analysis of each GRB within 24 hours of the trigger. Achieving this goal requires automatic triggering of X-Pipeline.

A. Automated launch of X-Pipeline by GCN triggers
Most of the information on sources analyzed by the various externally triggered burst searches in LIGO-Virgo comes from the GRB Coordinates Network (GCN) [48]. GCN notices and circulars are received in real time by LIGO-Virgo, and the information needed for the search analyses is parsed automatically by Perl scripts which are launched each time a GCN notice or circular is received. The information parsed includes the time and date of the event, the source position (right ascension and declination), the position error, and the duration of the event. For each source, these parameters are compiled and written to a trigger file.
Concurrently, a perl script runs at a central computing site and regularly checks if there are new source events listed in the trigger file. When there are new triggers, the script checks for availability of the LIGO-Virgo data which are necessary for analyzing the source. If the needed data are available, the script launches X-Pipeline event-generation jobs (which include simulation and off-source analyses) on the computing cluster. These jobs are monitored continuously to automatically determine when the jobs have finished. Once they are completed, the post-processing (tuning and detection/upper limit) jobs are automatically launched and likewise monitored. Successful completion of these steps results in a web page in which the results of the analysis are presented, and an email notification being sent to human analysts. Additionally, the scripts which monitor the status of the search and post-processing jobs log that progress for each source event and regularly write this information to a summary status web page. These GCN parsing and triggering scripts are now operational, and X-Pipeline is currently autonomously analysing GRBs from the Swift [49] satellite. Open-box results are available in as little as 6 hours following a GCN alert.
Other modifications currently being made to X-Pipeline focus on the larger sky position error boxes from the Fermi satellite [50]. For S6-VSR2, most of the GRB triggers come from the GBM instrument on Fermi, which gives a typical position uncertainty of several degrees. This is much larger than the typical uncertainty of a few arcmin for GRBs from Swift in S5-VSR1. The X-Pipeline launch scripts are currently being modified to set up a grid of sky positions covering this error region, and the handling of events is being modified to minimize the additional computational time required. Finally, the suite of simulated waveforms has been expanded to include binary neutron star and black-hole-neutron-star binary inspirals, since these systems are widely thought to be the progenitors of short GRBs.
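To illustrate what covering an error region with a grid of sky positions involves, the Python sketch below places points on concentric rings around the nominal GRB position. The ring layout, the function names, and the choice of grid spacing are assumptions made for this example; the layout used by the X-Pipeline launch scripts is not specified here. The number of points per ring uses a flat-sky estimate, which is adequate for error radii of a few degrees.

```python
import math


def offset_position(ra0_deg, dec0_deg, sep_deg, angle_deg):
    """Position at angular distance sep_deg from (ra0, dec0).

    angle_deg is the position angle: 0 points toward increasing declination,
    90 toward increasing right ascension.
    """
    dec0 = math.radians(dec0_deg)
    sep = math.radians(sep_deg)
    ang = math.radians(angle_deg)
    sin_dec = (math.sin(dec0) * math.cos(sep)
               + math.cos(dec0) * math.sin(sep) * math.cos(ang))
    sin_dec = max(-1.0, min(1.0, sin_dec))  # guard against rounding
    dec = math.asin(sin_dec)
    dra = math.atan2(math.sin(ang) * math.sin(sep) * math.cos(dec0),
                     math.cos(sep) - math.sin(dec0) * sin_dec)
    return (ra0_deg + math.degrees(dra)) % 360.0, math.degrees(dec)


def angular_separation(ra1_deg, dec1_deg, ra2_deg, dec2_deg):
    """Great-circle separation between two sky positions, in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians,
                               (ra1_deg, dec1_deg, ra2_deg, dec2_deg))
    cos_sep = (math.sin(dec1) * math.sin(dec2)
               + math.cos(dec1) * math.cos(dec2) * math.cos(ra1 - ra2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_sep))))


def sky_grid(ra0_deg, dec0_deg, radius_deg, spacing_deg):
    """Central position plus concentric rings out to radius_deg.

    Rings are separated by at most spacing_deg, and neighbouring points on a
    ring are separated by roughly spacing_deg or less.
    """
    points = [(ra0_deg % 360.0, dec0_deg)]
    n_rings = math.ceil(radius_deg / spacing_deg)
    for k in range(1, n_rings + 1):
        ring_radius = radius_deg * k / n_rings
        n_points = math.ceil(2.0 * math.pi * ring_radius / spacing_deg)
        for j in range(n_points):
            points.append(offset_position(ra0_deg, dec0_deg, ring_radius,
                                          360.0 * j / n_points))
    return points


if __name__ == "__main__":
    # Illustrative values only: a 5 degree error radius and 2.5 degree spacing.
    for ra, dec in sky_grid(120.0, 30.0, 5.0, 2.5):
        print(f"{ra:9.4f} {dec:9.4f}")
```

The number of grid points, and hence the added computational cost, grows roughly as the square of the ratio of error radius to grid spacing, which is why the handling of events matters for the larger Fermi error boxes.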

VI. SUMMARY
X-Pipeline is a software package designed to perform autonomous searches for gravitational-wave bursts associated with astrophysical triggers such as gamma-ray bursts. It performs a fully coherent analysis of data from arbitrary networks of detectors to sensitively search small patches of the sky for gravitational-wave bursts. X-Pipeline features automated tuning of background rejection tests, and a built-in simulation engine with the ability to simulate effects such as calibration uncertainties and sky position errors. X-Pipeline can be launched automatically by receipt of a GCN email, performing a complete analysis of data, including tuning and identification of GWB candidates, without human intervention. Each astrophysical trigger is analysed as a separate search, with background estimation and tuning performed using independent data samples local to the trigger. In a test on actual detector data for a real GRB, we find that X-Pipeline is sensitive to signals approximately a factor of 1.7 weaker than those detectable by the cross-correlation technique used in previous LIGO searches. X-Pipeline has recently been used for the analysis of GRBs from the LIGO-Virgo S5-VSR1 run, and is currently running autonomously during the S6-VSR2 run to search for gravitational waves associated with GRBs observed electromagnetically. Our goal is the rapid identification of possible GWBs on time scales short enough to prompt additional follow-up observations by other observatories.