• Tidak ada hasil yang ditemukan

3.3 Non-stationary and non-Gaussian noise

3.3.3 Signal consistency tests

To combat non-Gaussian behavior in the data, we perform additional statistical tests that go beyond the signal-to-noise. The most common of these types of tests areχ2tests, which measure the residual power in the data stream after subtracting off the inferred signal. If the residual is consistent with the Gaussian assumption, theχ2value should be low; otherwise theχ2should be large (see Fig. 3.4).

As with the SNR, theseχ2are sensitive to disagreements between the filter template and the actual signal. Large values ofχ2can arise from either non-Gaussian fluctuations or mismatch between the true waveform and the closest matching template in the bank.

72

There are several types ofχ2 statistics used in LIGO data analysis (see Chap. 6 in Ref. [75] for a brief review). Here we describe two particular tests, which will appear in pipelines later in this work. The first test (traditionalχ2) is a frequency-domain test which has featured in the most recent LIGO/Virgo searches for GWs from CBCs. The second test (autocorrelationχ2) is a time-domain test, which is the currently implementedχ2statistic used in the newgstlalpipeline to be described in Chap. 5.

Traditional χ2

Recent LIGO/Virgo searches for GWs from CBCs employed aχ2 statistic [76] based on the consid- eration that although a detector glitch may generate triggers with the same SNR as a GW signal, the manner in which the SNR is accumulated over time and frequency is likely to be different. For example, a glitch that resembles a delta function corresponds to a burst of signal power concentrated in a small time-domain window, but smeared out across all frequencies. A CBC waveform, on the other hand, will accumulate SNR across the duration of the template, consistently with the chirp-like morphology of the waveform.

To test whether this is the case, the template is broken intopfrequency bands with boundaries f0= 0< f1< f2< ... < fp1< fp=∞in such a way that the SNR accumulated in each frequency interval,

zj = 4Re Z fj

fj−1

˜ s(f)ˆh(f)

Sn(f) df, (3.28)

has the same expectation valuehzji=z/p, wherez=Pzj is total SNR achieved by the template.

Theχ2 statistic is computed from these by

χ2= 1 p

Xp j=1

(z−pzj)2, (3.29)

which is χ2-distributed with p degrees of freedom. The quantity pzj is the expected total SNR based on the SNR in frequency band [fj1, fj]. This technique is highly effective for suppressing the background when the signal duration is long, but computationally is rather costly, requiringp inverse Fourier transforms per template. Thus, in theIHOPEpipeline, which implements this test, theχ2 is computed only for triggers that have passed the coincidence stage (see Sec. 3.4).

Autocorrelation χ2

Another useful signal consistency test is obtained by comparing the SNR time series obtained from filtering the data to the autocorrelation of the template. This statistic has the main advantage that once the SNR time series is computed for a given template, all the required data to compute the statistic are already in memory; the statistic is extremely computationally efficient compared the traditional χ2. The autocorrelation χ2 statistic is closely related to the bank χ2, as described in Ref. [75].

Suppose we collect datas=n+Ahwe suspect to contain a signalAh, whereAis some amplitude and hh, hi= 1. Then in our filtering, as we maximize the signal to noise over time translations of the templateh, we measure the SNR time series

ρ(τ) =hn, he2πif τi+Ahh, he2πif τi (3.30)

See Eqn. 3.7. We define

α(τ)≡ hh, he2πif τi, (3.31) which is the autocorrelation of the whitened template. The autocorrelation takes its maximal value of 1 at τ = 0. In Eqn. 3.30, τ is to be interpreted as the difference in time between the actual coalescence time and the time-shifted template. In actual analysis, in which we do not know the actual coalescence time, we would takeτ= 0 to correspond to the time of the maximum SNR.

These considerations give us an estimate for the amplitudeAby maximizing Eqn. 3.30 over time and taking an ensemble average, so that the noise term disappears, givingA≈ hρmaxi. Now put all the measurable quantities on the same side of the equality to find that

ρ(τ)−ρmaxα(τ) =hn, he2πif τi. (3.32)

On the right, we have only the noise term, which will be Gaussian distributed if the noise n is Gaussian. On the left, we have only quantities that are measurable from the data and templates.

We can exploit this to compute aχ2 statistic from a given trigger, namely

χ2 =

Z Tmax

0 |ρ(τ)−ρmaxα(τ)|2dτ (3.33)

=

Z Tmax

0 |hn, he2πif τi|2dτ. (3.34)

74

Here Tmax is a tunable parameter that corresponds to the number of degrees of freedom in the χ2. If the SNR time series is computed with time resolution ∆t, then the number of degrees of freedom isN =Tmax/∆t.

In Fig. 3.4, we show an example of the statistical properties of the background triggers compared to triggers arising from simulated signals added to real detector data. In Gaussian noise, the SNR is the optimal detection statistic, and furthermore the background distribution in SNR falls expo- nentially. Here we see that the background has long excursions into the high SNR regime where, without the additional χ2 statistic, it would blend into the signal. However, background triggers usually also have higherχ2 than one would expect from signals. Therefore, theχ2 test helps to dis- tinguish between background and signal when the background has non-Gaussian fluctuations. We note that in the high SNR regime, simulated signalsalso obtain largeχ2. This fact arises from the discreteness of the template bank. Typically the simulated signal does not exactly match any of the templates in the bank. As the signal gets louder, the small fractional mismatch (e.g. Mmin= 0.97) becomes amplified and pulls the signal dangerously close to the background. In particular, since our searches have (so far) used non-spinning templates, a signal from a compact binary with spin could register in the pipeline with a largeχ2, obscuring it into the background. Thus, while the SNR is optimal in Gaussian noise with known signals, it is decidedly non-optimal when the background in non-Gaussian or the actual signal does not match the template. The latter observation provides even more motivation for the move to including more physical effects in the templates. We want not only to maximize the recovered signal-to-noise, but also to minimize theχ2for signal.