Chapter 6: Nonresonant pair production of highly energetic Higgs bosons
6.4 The ππ π» π» analysis
6.4.1 Event selection
along the candidate subjet, and the jet has N (or fewer) subjets. For ππ β« 0, a large fraction of the jet energy is scattered away from the candidate subjet, which implies the jet has at least N+1 subjets. π
3 is expected to be small for top jets.
However, QCD jets can randomly have small values of π
3too, and to increase the discrimination power,π
3/π
2or π
32 is used as an identifier for top jets. Smaller the value ofπ
32, more likely it is that the jet is a top jet.
We requireπ
32 < 0.46 (with a mis-tag rate of 0.5%) in certain hadronic top control regions (described later in Sec.6.4.2.1), for selecting a very pure sample of top jets.
Simulated events in the top quark control regions are corrected in order to take into account differences in the top-tagging efficiency between data and MC.
tends to have a lowerπ
Xbbscore as compared to theπ»jet in VH background events and gets classified as the second jet, as shown in Figure6.9. Thus, the second jet mass has a significantly larger distinguishing power than the first jet mass.
60 80 100 120 140 160 180 200
AK8 Jet mass [GeV]
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
Event 2018 HH MC
mass Jet1
mass Jet2
60 80 100 120 140 160 180 200
AK8 Jet mass [GeV]
0 5 10 15 20 25 30 35 40
Event 2018 VH MC
mass Jet1
mass Jet2
Figure 6.9: After pre-selection requirements, the distributions of the first and second jet mass for HH signal (left) and VH background (right) are shown in this figure.
The second jet in VH background events tends to be the W or Z boson instead of the Higgs boson.
Theπ
Xbb> 0.8 requirement helps to reduce the QCD jet contributions, resulting in a total background that is roughly 50% QCD multijet and 50%π‘ π‘ background. We also impose orthogonality to the VBF category (see Sec.6.5.1for description), by removing events which satisfy all of the following conditions :
β’ β₯ 2 AK4 jets(π), with π
T > 25 GeV and |π| < 4.7,
β’ the AK4 jets do not overlap with leptons, i.e., Ξπ (π , β) > 0.4, where β are muons (electrons) with π
T > 5(7)GeV, and,
β’ the two highest-π
TAK4 jets haveππ π > 500 GeV andΞππ π > 4.0.
Lastly, we designed a boosted decision tree (BDT) discriminator trained to enhance π» π» signal over QCD multijet and top quark background. We will define several event categories based on a 2D grid of the output score of the BDT discriminator as well as theπ
Xbbscore for the second jet. The BDT is described in the following section.
6.4.1.1 The BDT architecture and performance
To improve on the search sensitivity for the ππ π» π» analysis, we train a boosted decision tree (BDT) to discriminate between the HH signal events, and QCD and top quark background events. The input training variables for the BDT include the
Jet 1 and dijet system kinematics, the π
T of Jet 2, the π
32 values of both jets and the various ParticleNet scores for Jet 1. As mentioned in Sec. 6.4.1, our final fit categories will be based on a 2D grid of the output score of the BDT discriminator as well as theπ
Xbbscore for the second jet. Hence, we need to have zero correlation between the BDT output score and the Jet 2 π
Xbb score and we explicitly do not include the Jet 2 ParticleNet scores for training the BDT. The mass of the dijet system (ππ π) is calculated using the soft drop and regressed masses of jet 1 and jet 2, respectively. The full list of training variables is shown in Table6.3. The data and simulation comparisons of these input variables are shown later in Sec.6.4.1.2.
Table 6.3: List of input variables used for the BDT training.
Variable Definition πTπ π Dijet system π
T
ππ π Dijet systemπ ππ π Dijet system mass
πmiss
T Missing transverse energy π
π1
32 Jet 1π
32
π
π2
32 Jet 2π
32
π
π1
π π· Jet 1 soft-drop mass π
π1
T Jet 1 π
T
ππ1 Jet 1π π
π1
Xbb Jet 1π
Xbbscore π
π1
QCDb Jet 1ππΆ π·πscore π
π1
QCDbb Jet 1ππΆ π·π π score π
π1
QCDothers Jet 1ππΆ π·ππ‘ βππ π score π
π2
T Jet 2 π
T
π
π1
T /ππ π Jet 1 π
T over dijet system mass π
π2
T /ππ π Jet 2 π
T over dijet system mass π
π1 T /π
π2
T Jet 2 π
T / Jet 1π
T
The BDT is trained using the XGBoost method. 60% of the events were used for training and validation, and the remaining 40% were used for testing. The parameters used for the BDT training are as follows:
β’ Number of trees = 400
β’ Maximum depth = 3
β’ Learning rate = 0.1
β’ L2 regularization strength : 1.0.
We specifically ensured that none of the training variables induced any sculpting of the Jet 2 mass as tighter requirements on the BDT output score are imposed. In Figure6.10, we show the shape of the Jet 2 regressed mass in the QCD background simulation as the requirement on the BDT score is increased. The plot shows that the background has an exponentially falling shape for different selections cuts on the BDT output score, and no sculpting is observed.
50 100 150 200 250 300 350 400 450 500
Jet 2 Regressed Mass [GeV]
0 0.02 0.04 0.06 0.08 0.1 0.12 0.14
Fraction of Events
BDT > 0.01 BDT > 0.03 BDT > 0.11 BDT > 0.25 BDT > 0.35 BDT > 0.43
Figure 6.10: The jet 2 mass distribution is shown for QCD multijet background from simulation samples as the requirements on the BDT output score is tightened.
No mass sculpting is observed here.
The discriminator output score shape for theππ π» π» signal and main backgrounds are shown in Figure 6.11. The backgrounds tend to peak at lower BDT score values, and the signal has a higher concentration at the larger BDT score values.
The performance of the BDT is shown in a ROC curve in Figure6.12, where we plot the signal efficiency against the background efficiency as one scans the requirement on the BDT output score. The background here consists of both theπ‘ π‘and QCD. To estimate the power of the BDT, we define a simple alternative cut based selection strategy (not used in this analysis) as follows:
β’ Jet 1 π
T > 350 GeV, Jet 2π
T > 310 GeV
β’ Jet 1 soft drop massβ[105, 135] GeV,
β’ Jet 2 regressed mass>50 GeV,
β’ Jet 1π
Xbb > 0.985.
0.0 0.2 0.4 0.6 0.8 1.0 Event BDT score
10 3 10 2 10 1 100 101
Arbitrary Unit
QCDttbar GluGluToHH
Figure 6.11: The ππ π» π» signal and main background distributions of the BDT output score are shown in this figure. The histograms are normalized to unity. The backgrounds tend to peak at lower BDT score values, and the signal has a higher concentration at the larger BDT score values.
To make a fair comparison against the cut-based analysis and the BDT based analysis, we compare the value of the signal efficiency at Jet 2π
Xbb> 0.98. We observe that at a signal efficiency equal to the cut-based analysis selection, the BDT has twice the background rejection power.
6.4.1.2 Data and MC comparisons of BDT input variables
The most important BDT variables are shown in Fig.6.13, in the hadronicπ‘ π‘control region (defined later in Sec.6.4.2.1). The data is well described by the simulation for these input variables.
6.4.1.3 BDT event categories
To obtain event categories with an optimized signal sensitivity, we scan across different values of the BDT output score and the π
Xbb score of the second jet, and compute the signal and background yields for Jet 2 mass in the range 120β
130 GeV. The total background prediction in the range 120β130 GeV is obtained using a simple linear interpolation of the data in the Jet 2 mass sidebands, i.e., [110,120] βͺ [130,140] GeV. We then choose those working points for the BDT score and the Jet 2π
Xbb, that yield the best expected upper limit on the SMππ π» π» cross section. We keep repeating this procedure, excluding the events that were
0 0.05 0.1 0.15 0.2 0.25
Signal Eff
0 0.0002 0.0004 0.0006 0.0008 0.001 0.0012 0.0014 0.0016 0.0018 0.002
Bkg Eff HH BDT
HH BDT > 0.43
Cut-Based
Figure 6.12: The ROC curve for the BDT is shown in this figure. The signal efficiency vs the background efficiency is plotted as one scans the requirement on the BDT output score. The black cross corresponds to the working point for the cut-based analysis selection (defined in text). The green cross corresponds to BDT score > 0.43, which is the requirement in the most optimized analysis category (defined in Sec. 6.4.1.3). The discontinuities observed here arise due to certain large weight events in the QCD simulation samples.
previously selected to form an event category, until the remaining events have negligible contribution to the total sensitivity.
The specific event categories obtained with this optimization procedure are mutually exclusive and summarized as follows:
β’ Category 1: Jet 2π
Xbb > 0.980 and BDT> 0.43
β’ Category 2: Jet 2π
Xbb >0.980 and 0.11 <BDT β€ 0.43, or Jet 2π
Xbb< 0.980 and BDT>0.43 (excluding category 1)
β’ Category 3: Jet 2π
Xbb β₯ 0.95 and BDT> 0.03 (excluding category 1 and 2).
The event categories are visually represented in Figure6.14. Event category 1 is the most sensitive, with a signal-to-background ratio of about 1:6, assuming SMππ π» π» cross section. Event category 2 is the second most sensitive, with a S/B of 1:33 and category 3 has a S/B ratio of about 1:200. The fail region in Fig.6.14is used
60 80 100 120 140 160 180 200 220 240 260 280 [GeV]
mSD
j1
0.0 0.5 1.0 1.5
Data / pred.
0 500 1000 1500 2000 2500 3000
Events / 10 GeV
Data tt+jets V+jets, VV
QCD Bkgd. stat. unc.
Data tt+jets V+jets, VV
QCD Bkgd. stat. unc.
CMSSupplementary CR
t t
(13 TeV) 138 fb-1
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
b
Db
j1
0.0 0.5 1.0 1.5
Data / pred.
0 500 1000 1500 2000 2500 3000 3500 4000
Events / 0.05
Data tt+jets V+jets, VV
QCD Bkgd. stat. unc.
Data tt+jets V+jets, VV
QCD Bkgd. stat. unc.
CMSSupplementary CR
t t
(13 TeV) 138 fb-1
800 1000 1200 1400 1600 1800
[GeV]
mHH
0.0 0.5 1.0 1.5
Data / pred.
0 100 200 300 400 500 600 700 800 900
Events / 30 GeV
Data tt+jets V+jets, VV
QCD Bkgd. stat. unc.
Data tt+jets V+jets, VV
QCD Bkgd. stat. unc.
CMSSupplementary CR
t t
(13 TeV) 138 fb-1
0.2 0.3 0.4 0.5 0.6 0.7
/mHH T j1
p 0.0
0.5 1.0 1.5
Data / pred.
0 500 1000 1500 2000 2500
Events / 0.025
Data tt+jets V+jets, VV
QCD Bkgd. stat. unc.
Data tt+jets V+jets, VV
QCD Bkgd. stat. unc.
CMSSupplementary CR
t t
(13 TeV) 138 fb-1
Figure 6.13: The data and expected background distributions from simulation in theπ‘ π‘ control region (Sec.6.4.2.1) for some of the most discriminating BDT input variables, including the Jet 1π
SD(upper left) andπ
Xbb(upper right), ππ» π» (lower left), and π
T j1/ππ» π» (lower right). The lower panel shows the ratio of the data and the total background prediction, with its statistical uncertainty represented by the shaded band. The error bars on the data points represent the statistical uncertainties.
as a control region for QCD background estimation, and will be discussed later in Sec.6.4.2.2.