
6.3 Conclusions and additional benefits of this approach

6.3.1 Future work

Our tests have indicated that we have reached the limit of extracting information via a KleineWelle analysis of the auxiliary channels with our available tools (MLAs). Future improvement in classification efficiency is therefore likely to come from including additional sources of useful information, rather than from refinements to the algorithms themselves. There are many other ways to extract information from the auxiliary channels.

A simple change would be to use the Omega algorithm described in Section 4.1.2 instead of the KleineWelle algorithm to identify transients in the auxiliary channels. However, we do not have to stop there. Each auxiliary channel's data are a time series, and for certain channels it makes sense to use the value, derivative, and acceleration of the channel at a specific time; the channels we expect to be useful in this way are the "slow" channels described in Section 4.2.1.4, such as those monitoring the angular positions of the mirrors.
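As a rough illustration of the slow-channel idea, the sketch below (a minimal example, not the pipeline's actual code) extracts the value, first derivative, and second derivative of a slow auxiliary channel at the time of a candidate glitch; the sample rate, GPS times, and stand-in data are hypothetical.

```python
import numpy as np

def slow_channel_features(data, sample_rate, t_glitch, t_start):
    """Value, derivative, and acceleration of a slow channel at t_glitch.

    data: 1-D array of samples from a slow auxiliary channel.
    sample_rate: samples per second (slow channels are of order 16 Hz).
    t_glitch, t_start: GPS times of the glitch and of data[0].
    """
    idx = int(round((t_glitch - t_start) * sample_rate))
    # Finite-difference estimates of the first and second time derivatives.
    deriv = np.gradient(data, 1.0 / sample_rate)
    accel = np.gradient(deriv, 1.0 / sample_rate)
    return np.array([data[idx], deriv[idx], accel[idx]])

# Hypothetical example: a 16 Hz angular-position monitor, glitch at GPS 967000100.
rate = 16.0
t0 = 967000000.0
samples = np.sin(2 * np.pi * 0.01 * np.arange(0, 200 * rate) / rate)  # stand-in data
print(slow_channel_features(samples, rate, t_glitch=967000100.0, t_start=t0))
```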

An advantage of MLA classifiers is that they can incorporate various potentially diverse types of information and establish correlations between multiple parameters. Thus, we can conceivably combine information from a transient-identifying algorithm like KleineWelle or Omega for the fast channels with more slowly varying baseline information about the detector subsystems in the same classifier. The rate at which glitches in the fast channels couple to the GW channel can depend on the value and rate of change/acceleration of the slow channels' time series in a complicated way. Machine learning should be able to identify such non-linear correlations automatically, even if they are not known in advance.
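A minimal sketch of how fast-channel transient parameters and slow-channel features could be combined into a single feature vector and fed to a classifier; scikit-learn's random forest is used purely as a stand-in for the MLA classifiers studied in this work, and all feature values are synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

def build_feature_vector(kw_params, slow_features):
    """Concatenate KleineWelle-style fast-channel parameters (e.g. significance,
    duration, central frequency, time offset) with slow-channel features
    (value, derivative, acceleration) into one event-level feature vector."""
    return np.concatenate([kw_params, slow_features])

# Stand-in training set: label 1 = glitch in the GW channel, 0 = clean time.
events, labels = [], []
for label in (1, 0):
    for _ in range(500):
        kw_params = rng.normal(loc=label, size=4)         # fake fast-channel features
        slow_feats = rng.normal(loc=0.5 * label, size=3)  # fake slow-channel features
        events.append(build_feature_vector(kw_params, slow_feats))
        labels.append(label)

X, y = np.array(events), np.array(labels)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Output is a continuous rank in [0, 1]; non-linear correlations between the
# fast- and slow-channel features are learned automatically.
print(clf.predict_proba(X[:3])[:, 1])
```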

[Figure 6.9 plot: efficiency versus false alarm probability for ANN, RFBDT, SVM, and OVL, together with the BurstDQcat1, BurstDQcat2, and BurstDQcat3 reference values; see the caption below.]

Figure 6.9: Comparison of the best performance for RFBDT (green), ANN (blue), SVM (red), and OVL (light blue) using the full S6 data sets with the application of the traditional data-quality flag vetoes for the burst search. BurstDQcat1, BurstDQcat2, and BurstDQcat3 show the efficiency at vetoing glitches in the GW channel with an SNR above 8 when Category 1, Categories 1 and 2, and Categories 1, 2, and 3 Burst data-quality flags are applied, respectively. The Burst data-quality flags were defined for the gamma-ray burst search, which looks for excess power using the Omega algorithm (see Section 4.1.2). An SNR of 8 was chosen because the threshold on KW significance for the GW channel was 35, which roughly translates to an SNR of 8.

The data-quality flags for the burst search are quite similar to the high-mass data-quality flags described in Section 4.2.1, except Burst Category 3 is like high-mass Category 4.
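For concreteness, the efficiency and false alarm probability pairs that trace out curves like those in Figure 6.9 can be computed from classifier ranks evaluated on known glitch samples and on clean-time samples. The sketch below uses synthetic rank distributions as stand-ins for the real classifier output.

```python
import numpy as np

def efficiency_and_fap(glitch_ranks, clean_ranks, threshold):
    """Fraction of GW-channel glitches removed (efficiency) and fraction of
    clean times falsely removed (false alarm probability) at a rank threshold."""
    efficiency = np.mean(glitch_ranks >= threshold)
    fap = np.mean(clean_ranks >= threshold)
    return efficiency, fap

rng = np.random.default_rng(1)
glitch_ranks = rng.beta(5, 2, size=10000)  # stand-in: glitches tend to rank high
clean_ranks = rng.beta(2, 5, size=10000)   # stand-in: clean times tend to rank low

# Sweeping the threshold traces out an efficiency-vs-FAP curve like Figure 6.9.
for thr in (0.9, 0.7, 0.5):
    eff, fap = efficiency_and_fap(glitch_ranks, clean_ranks, thr)
    print(f"threshold={thr:.1f}  efficiency={eff:.2f}  FAP={fap:.3f}")
```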


Future work will also focus on tuning and applying the MLA and OVL classifiers to searches for GWs from specific astrophysical sources. As previously mentioned, using an MLA to quantify the glitchiness of the data allows us to provide a continuous rank, r_MLA ∈ [0, 1], rather than a binary flag. The OVL classifier's output can also be converted into a rank which, although discrete by construction, is not a binary flag.
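One possible way (not necessarily the mapping used in this work) to turn OVL's ordered list of veto configurations into a discrete rank in [0, 1] is sketched below; the configuration index and the number of configurations are hypothetical.

```python
def ovl_rank(vetoed_by, n_configs):
    """Map the index of the OVL veto configuration that captured an event
    (0 = highest-ranked configuration) to a discrete rank in [0, 1].
    Events not captured by any configuration get rank 0."""
    if vetoed_by is None:
        return 0.0
    return 1.0 - vetoed_by / n_configs

# Hypothetical: 50 veto configurations; an event caught by the third-best one.
print(ovl_rank(vetoed_by=2, n_configs=50))     # 0.96
print(ovl_rank(vetoed_by=None, n_configs=50))  # 0.0
```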

Future work will focus on the optimal way to fold this rank directly into searches for gravitational waves, as a parameter characterizing a candidate event alongside the rest of the data from the gravitational-wave channel, as opposed to the standard approach of vetoing entire segments of flagged data based on a hard threshold on data quality.
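Purely as an illustration of the idea (the optimal combination is exactly the open question described above), the sketch below down-weights a hypothetical candidate's SNR by the data-quality rank instead of applying a hard veto; the functional form and the parameter alpha are assumptions, not the method used in any search.

```python
import numpy as np

def reweighted_statistic(snr, r_dq, alpha=1.0):
    """Hypothetical down-weighting of a candidate's ranking statistic by the
    data-quality rank r_dq in [0, 1] (1 = glitch-like), instead of vetoing the
    surrounding segment outright. alpha sets how strongly data quality is penalized."""
    return snr - alpha * np.log1p(r_dq / (1.0 - r_dq + 1e-9))

for r in (0.0, 0.5, 0.99):
    print(r, reweighted_statistic(snr=10.0, r_dq=r))
```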

Chapter 7

Data analysis methods in the search for black hole binary systems

The search described in this thesis is an all-sky, all-time search for high-mass coalescing binaries in multi-detector data. Here, all-time means that we do not assume a priori when a GW signal might arrive at the detectors; we are restricted only by the calendar time of the data we are searching. All-sky means that we do not presume from what area of the local universe a GW might originate. The data analyzed in this search, as described in Reference [17], are from LIGO's sixth science run and Virgo's second and third science runs (S6-VSR2/3), which took place during the calendar years 2009-2010.

There have been many all-sky, all-time searches for both high-mass and low-mass coalescing binary systems. These searches have all relied on variations of the two-stage (sometimes called hierarchical in the literature) coincidence-based pipeline described in Section 7.3, the heart of which is a matched filter for waveform templates. Starting with LIGO's second science run (S2), a single-stage version of the pipeline was used to search for binary black hole systems with component masses between 3 and 20 M⊙ [114], and also for neutron star binary systems with component masses between 1 and 3 M⊙ [115]. In S2, there was much less data to analyze and the χ² signal-consistency test was not used, so the second stage was not necessary. In S3, a search was conducted specifically for spinning binary black hole systems, using the Buonanno-Chen-Vallisneri method for waveform construction; more asymmetric sources (1 < m1/M⊙ < 3 and 12 < m2/M⊙ < 20) were considered, as the precession effects of spinning systems are more apparent there [116]. The non-spinning, inspiral-only searches in S3 and S4 data were presented in the same paper (Reference [117]), with component masses as small as 0.35 M⊙ (primordial black holes) and a maximum total mass of 40 M⊙ in S3 and 80 M⊙ in S4. Notably, the results in Reference [117] were the first published with the effective SNR as the ranking statistic and the first to use a two-stage pipeline like the one described in this chapter, the heart of which is described in Reference [90].
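As a reminder of what the matched filter at the heart of these pipelines does, the sketch below implements a schematic frequency-domain matched filter applied to white Gaussian noise with a toy sinusoidal template; the conventions are simplified (phase maximization is omitted) and none of the numbers correspond to the real search.

```python
import numpy as np

def matched_filter_snr(data_f, template_f, psd, n, delta_t):
    """Schematic matched filter: |real part of the SNR time series| for one
    template, given one-sided frequency-domain data, template, and noise PSD."""
    delta_f = 1.0 / (n * delta_t)
    integrand = data_f * np.conj(template_f) / psd
    corr_t = 2.0 * np.fft.irfft(integrand, n) / delta_t
    # sigma is the template's noise-weighted norm.
    sigma = np.sqrt(4.0 * delta_f * np.sum(np.abs(template_f) ** 2 / psd))
    return np.abs(corr_t) / sigma

# Toy usage: a windowed sinusoid injected into unit-variance white noise.
rng = np.random.default_rng(2)
n, delta_t = 4096, 1.0 / 1024
h_t = np.sin(2 * np.pi * 100 * np.arange(n) * delta_t) * np.hanning(n)
d_t = rng.normal(size=n) + h_t
template_f = np.fft.rfft(h_t) * delta_t
data_f = np.fft.rfft(d_t) * delta_t
psd = np.full(len(template_f), 2 * delta_t)  # one-sided PSD of the white noise
print(matched_filter_snr(data_f, template_f, psd, n, delta_t).max())
```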

By S5, the analyses and publications started being split into low-mass and high-mass searches, because we finally had full inspiral-merger-ringdown waveforms for the high-mass systems. Prior to the inclusion of Virgo, two low-mass searches were published (separately, for historical reasons), each for systems with total mass between 2 and 35 M⊙. Reference [118] covered the first year of operation during S5, which included 186 days of science data. Reference [56] covered the analysis of months 13-18 of operation, but the results included the first year as a prior, so the presented upper limits on the rate of CBCs were cumulative. A third paper presented the first combined analysis of joint LIGO-Virgo data in an all-sky search for CBCs. The analyzed data were the last 6 months of LIGO's S5 and the 4 months of Virgo's science run 1 (VSR1) [54], and the results from the first two S5 papers were used as a prior. It should be noted that there were four detectors to consider (H1, H2, L1, V1), creating 11 possible detector combinations with two or more detectors and presenting a new challenge for the collaboration. As a result, the inverse false alarm rate ranking system (see Section 7.6) was developed [55]. A similar analysis was carried out for S6, VSR2, and VSR3, except that H2 was not analyzed and H1, L1, and V1 had improved sensitivity [4]. The mass space for the S6-VSR2/3 low-mass search was restricted to less than 25 M⊙ so that there would be no overlap with the high-mass search, as such an overlap could necessitate a trials factor [58].
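The counting behind the 11 detector combinations can be checked in a few lines:

```python
from itertools import combinations

detectors = ["H1", "H2", "L1", "V1"]
combos = [c for r in range(2, len(detectors) + 1)
          for c in combinations(detectors, r)]
print(len(combos))                        # 11 combinations with two or more detectors
print(["".join(c) for c in combos])
```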

There was only a single high-mass search published for S5, and it included only LIGO data [23]; Virgo's noise profile during S5 was not very sensitive to the low frequencies that contain the most information about the coalescence of high-mass systems [119]. This search analyzed H1, H2, and L1 data for systems with total mass between 25 and 100 M⊙, and was the first to use full inspiral-merger-ringdown waveforms. A joint LIGO-Virgo search was performed on the S6-VSR2/3 (H1, L1, V1) data, again targeting systems with total mass between 25 and 100 M⊙ [17]. The S6-VSR2/3 data were the last collected before the detectors were turned off to prepare for Advanced LIGO and Virgo. The specifics of the S6-VSR2/3 high-mass search are described in detail in the following sections.

It should be noted that all the searches just referenced are coincidence-based: the data from each detector are analyzed separately, and candidate GW events are identified when triggers from two or more different detectors are coincident in time and matched to similar waveform templates. In principle, there is a superior method for performing an all-sky, all-time search for coalescing binaries in multi-detector data: the coherent search. Coherent searches line up the data from the various detectors with the appropriate time delay for different source locations on the sky, and find coherent triggers when the amplitudes and phases line up [120]. At the time of analysis, we believed that coherent searches were too computationally expensive and also unnecessary, since we could perform a coherent analysis on only the top events produced by the two-stage, low-threshold coincident search described below. However, research in this area is ongoing.
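The geometry behind lining up the data is the light travel-time delay between detectors for a given sky direction. A minimal sketch, assuming approximate Earth-fixed detector coordinates (a real analysis would use the surveyed site coordinates and the full antenna response):

```python
import numpy as np

C = 299792458.0  # speed of light, m/s

def arrival_time_delay(det1_pos, det2_pos, prop_dir):
    """Time delay (seconds) between two detectors for a plane GW travelling
    along the unit vector prop_dir in an Earth-fixed frame.
    Positive means the wavefront reaches det2 after det1."""
    return np.dot(det2_pos - det1_pos, prop_dir) / C

# Approximate Earth-fixed position vectors (metres) for illustration only.
hanford = np.array([-2.16e6, -3.83e6, 4.60e6])
livingston = np.array([-7.43e4, -5.50e6, 3.21e6])
prop_dir = np.array([0.0, 0.0, 1.0])  # hypothetical propagation direction

print(arrival_time_delay(hanford, livingston, prop_dir))  # a few milliseconds at most
```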

7.1 The inputs to the search

The inputs to the search are the calibrated data from each detector from the times when it is in Science Mode; these are known as science segments [63]. For the search outlined in this thesis, the total operating time for the S6-VSR2/3 data was split into 9 analysis periods, each between ∼4.4 and ∼11.2 weeks long.

The divisions into these periods were sometimes based on an extended downtime due to an upgrade/repair of the detector hardware or software, but were sometimes arbitrarily made to create manageably sized analysis periods and to crudely capture the slowly time-varying detector performance between commissioning breaks.

Each analysis period is composed of the science segments for each detector. The beginning of each science segment is determined by the operator and Science Monitor declaring, after lock acquisition and a moment for the violin suspension modes to damp down, that the detector's data are adequate. The end of the science segment is determined by the detector going out of lock. The quality of the data in each analysis period is variable; this is where the veto segments come into play. We define Category 1 vetoes for data which are egregiously bad even in Science Mode, and only analyze data that pass the Category 1 vetoes. The Category 2, 3, and 4 vetoes are applied at the end of the analysis pipeline, effectively removing triggers from the already-analyzed Category 1 data. See Section 4.2.1 for a more in-depth discussion of the data-quality flags and vetoes. The amount of data in each analysis period at each category level is given in Table 7.1, and the total coincident (observation) time is given in Table 7.2.
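A minimal sketch of the segment arithmetic described here, using hypothetical GPS intervals; production analyses use dedicated segment-database tools rather than hand-rolled interval code like this.

```python
def subtract_segments(science, vetoes):
    """Remove veto intervals from science segments (all [start, end) GPS pairs)."""
    out = []
    for s_start, s_end in science:
        pieces = [(s_start, s_end)]
        for v_start, v_end in vetoes:
            new_pieces = []
            for p_start, p_end in pieces:
                # Keep the parts of this piece that lie outside the veto interval.
                if v_end <= p_start or v_start >= p_end:
                    new_pieces.append((p_start, p_end))
                    continue
                if v_start > p_start:
                    new_pieces.append((p_start, v_start))
                if v_end < p_end:
                    new_pieces.append((v_end, p_end))
            pieces = new_pieces
        out.extend(pieces)
    return out

science = [(967000000, 967010000), (967020000, 967023000)]  # hypothetical science segments
cat1_vetoes = [(967004000, 967004500)]                       # hypothetical Category 1 veto

analyzable = subtract_segments(science, cat1_vetoes)
livetime = sum(end - start for start, end in analyzable)
print(analyzable, livetime)
```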