Chapter III: Supersymmetry and searches at the LHC
4.4 Design of a multipurpose scouting framework for Run II
Calo-scouting event format
The event format for scouting with calo jets is depicted on the left side of Figure 4.4.
It consists of the following components:
• Calo jets objects, which hold the momentum vectors of the reconstructed calo jets as well as auxiliary information, such as the fraction of the jet energy contained in the ECAL and in the HCAL.
• The magnitude and angle of $\vec{p}_T^{\text{miss}}$, computed using calorimeter-level quantities.
• The value of ρ, the median pT density in the event, computed using calorimeter-level information.
• Primary vertex objects, each holding the coordinates and associated uncertainties of a reconstructed primary vertex.
Events with HT > 250 GeV saved in this format have an average size of approximately 1.5 kB.
In the Run I data scouting analysis using calo jets, there was no dedicated scouting event format: the CMS software objects corresponding to the reconstructed calo jets were saved with no reformatting. The average event size in that analysis was 10 kB [100]. The much smaller event size under the new Run II framework indicates that repacking the jets into dedicated scouting jet objects significantly decreases the storage overhead.
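The calo-scouting event content listed above can be pictured as a small set of flat records. The following is a minimal sketch only: the class and field names are hypothetical, and the (pt, eta, phi, mass) parametrization of the jet momentum is an assumption, since the text specifies only which kinds of quantities are stored.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ScoutingCaloJet:
    # Hypothetical field names; the momentum parametrization is assumed.
    pt: float
    eta: float
    phi: float
    mass: float
    em_energy_fraction: float   # fraction of the jet energy in the ECAL
    had_energy_fraction: float  # fraction of the jet energy in the HCAL


@dataclass
class ScoutingVertex:
    # Coordinates and associated uncertainties of a primary vertex.
    x: float
    y: float
    z: float
    x_err: float
    y_err: float
    z_err: float


@dataclass
class CaloScoutingEvent:
    jets: List[ScoutingCaloJet]
    met_pt: float   # magnitude of the missing transverse momentum
    met_phi: float  # azimuthal angle of the missing transverse momentum
    rho: float      # median pT density in the event
    # Vertices default to an empty list for events that carry none.
    vertices: List[ScoutingVertex] = field(default_factory=list)


# Example: an event with one jet and no vertex information.
jet = ScoutingCaloJet(pt=120.0, eta=0.5, phi=1.0, mass=10.0,
                      em_energy_fraction=0.3, had_energy_fraction=0.7)
event = CaloScoutingEvent(jets=[jet], met_pt=35.0, met_phi=-2.0, rho=12.0)
```

Storing only a handful of numbers per object, rather than the full reconstruction objects, is what keeps the per-event size small.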
PF-scouting event format
The event format for scouting with PF objects is shown on the right side of Figure 4.4. It consists of the following components:
• The collection of PF candidates with pT > 0.6 GeV. Each PF candidate is represented by its 4-momentum and an integer indicating its particle type.
• PF jet objects, which hold the momenta and identification variables of jets clustered from the PF candidates.
• Primary vertex objects, identical to those included in the calo-scouting event content.
• The magnitude and angle of $\vec{p}_T^{\text{miss}}$, and the value of ρ.
• Electron, muon, and photon objects, each associated with a 4-momentum vector and a collection of identification variables.
Figure 4.4: Schematics of the calo-scouting (left) and PF-scouting (right) event formats. ‘MET’ refers to $\vec{p}_T^{\text{miss}}$, ρ is a measure of the median pT density in the event, and ‘AK4’ indicates the anti-kT algorithm with R = 0.4.
The typical PF-scouting event size is 10 kB, most of which is occupied by the PF candidate objects. The inclusion of the PF candidates allows for more complex analysis strategies involving, for example, jet substructure variables computed using the constituents of the jets.
HLT data streams for scouting
Two scouting data streams were implemented in the HLT software, referred to as the PF-scouting and calo-scouting streams. Each stream has its own set of primary datasets and trigger paths, and its own output data format as discussed above.
Scouting trigger paths have the same basic structure as standard HLT paths (see Figure 4.1), except that after the last stage of event reconstruction there is no final event selection filter. Instead, the reconstructed physics objects are passed to a software module that packs them into one of the special scouting formats. The objects are then written to disk in files grouped by PD.
The hadronic scouting triggers designed and deployed in 2015 were inspired by the scouting triggers used in Run I. They select events based on the value of calorimeter-level HT, using HT thresholds significantly lower than those of the paths in the standard HLT menu. The triggers are seeded by L1 triggers that select events based on the value of HT reconstructed at the L1. The thresholds of the HT scouting triggers are illustrated in Figure 4.5, and more information about the choice of thresholds is provided in the following subsections.
Figure 4.5: Diagram illustrating the HT thresholds of the hadronic scouting and parking triggers deployed in Run II. Note that the thresholds indicated for the PF-scouting and parking triggers correspond to the triggers used in 2016; in 2015 the thresholds were set to 450 GeV. The HT trigger in the standard HLT menu (indicated in purple for comparison with the scouting triggers) was tightened from 800 GeV to 900 GeV in 2016.
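To make the threshold comparison concrete, the sketch below computes HT as a scalar sum of jet pT and reports which of the HT selections quoted in this section an event would satisfy. The minimum jet pT of 40 GeV entering the sum is an assumed value, not one stated in the text, and the function names are hypothetical.

```python
# HT thresholds in GeV quoted in this section (2016 values for the
# PF-scouting and standard HLT paths).
HT_THRESHOLDS_GEV = {
    "calo-scouting": 250.0,
    "PF-scouting (2016)": 410.0,
    "standard HLT (2016)": 900.0,
}


def scalar_ht(jet_pts, min_jet_pt=40.0):
    """Scalar sum of the pT of jets above min_jet_pt (assumed cut, in GeV)."""
    return sum(pt for pt in jet_pts if pt > min_jet_pt)


def selections_passed(ht):
    """Names of the HT selections satisfied by an event with this HT."""
    return [name for name, cut in HT_THRESHOLDS_GEV.items() if ht > cut]


# Example: three jets, one of which is too soft to enter the sum.
ht = scalar_ht([200.0, 150.0, 30.0])
passed = selections_passed(ht)
```

An event with HT of 350 GeV, as in this example, is kept only by the calo-scouting selection, which illustrates the kinematic region that scouting opens up below the standard trigger.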
An important idea in trigger design is that of a trigger turn-on curve, which gives the fraction of desired events that the trigger successfully selects as a function of a kinematic variable. Turn-on curves will be shown later in the context of a dijet resonance search with data scouting. Additional triggers are included in the scouting data streams to facilitate the measurement of turn-on curves; these are paths with looser selection requirements that provide a baseline for the efficiency measurement. The triggers used to measure turn-on curves are prescaled by large factors to avoid excessive event rates.
The PF-scouting stream
The list of trigger paths in the PF-scouting stream is provided in Table 4.1. The rate at which these paths select events is low enough that the PF algorithm can be run for each event. The reconstructed PF objects are used to produce output events in the PF-scouting format described above. Beyond the standard PF sequence, further reconstruction sequences are run to compute isolation sums and the other identification variables needed for muons, electrons, and photons.
The triggers in the PF-scouting stream each come in two versions: one that performs b-tag reconstruction and one that does not. The two triggers in each pair have identical event selection requirements. The b-tagging version of each trigger runs the CSVv2 algorithm (see Section 2.4) to predict whether each reconstructed jet is a b-jet. This functionality is placed in a separate path for a technical reason: the HLT b-tag reconstruction sequence contains filters that reject some events (specifically those that have no jets or no reconstructed vertices). By putting the b-tagging sequence outside the main trigger path, we avoid spuriously rejecting these events. The rates in Table 4.1 refer to the version of each path that does not run b-tagging; the rate of the b-tagging version is generally slightly lower due to the filters associated with the b-tag sequence.
To facilitate the measurement of trigger turn-on curves for the scouting triggers, additional paths are included that have the same L1 trigger requirements as the main scouting paths, but no HLT requirement at all. These L1-only triggers provide a baseline for measuring trigger efficiencies. That is, they allow the calculation of the trigger efficiency as
$$\epsilon_{\text{HLT}} = \frac{\#\,\text{passing L1 and HLT}}{\#\,\text{passing L1}}. \tag{4.2}$$
This measurement is typically performed in bins of a kinematic variable of interest, such as HT or the invariant mass of the two leading jets. Because of the extremely high rate of the L1 seeds, the L1-only scouting triggers are prescaled. The prescale factor is chosen to achieve a rate of order 10 Hz per path.
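A binned version of Eq. 4.2 can be sketched as follows. The input is assumed to be the list of events recorded by an L1-only path, each with the value of the kinematic variable and a flag saying whether the HLT requirement was also satisfied. The function name is hypothetical, and the uncertainty is the simple binomial approximation, chosen here for brevity; analyses often use Clopper-Pearson intervals instead.

```python
from bisect import bisect_right
from math import sqrt


def binned_efficiency(values, passed_hlt, bin_edges):
    """Efficiency per bin for events that passed the L1-only trigger.

    values     : kinematic variable (e.g. HT in GeV) for each event
    passed_hlt : True if the event also satisfied the HLT requirement
    bin_edges  : increasing bin edges; bins are [low, high)
    Returns a list of (efficiency, uncertainty); (None, None) if a bin is empty.
    """
    n_bins = len(bin_edges) - 1
    n_total = [0] * n_bins
    n_pass = [0] * n_bins
    for x, ok in zip(values, passed_hlt):
        i = bisect_right(bin_edges, x) - 1
        if 0 <= i < n_bins:  # ignore underflow and overflow
            n_total[i] += 1
            n_pass[i] += int(ok)
    result = []
    for total, npass in zip(n_total, n_pass):
        if total == 0:
            result.append((None, None))
            continue
        eff = npass / total
        err = sqrt(eff * (1.0 - eff) / total)  # simple binomial approximation
        result.append((eff, err))
    return result


# Toy input: eight events with made-up HT values and HLT decisions.
ht_values = [260, 270, 310, 320, 330, 340, 420, 430]
hlt_flags = [False, False, True, False, True, True, True, True]
turn_on = binned_efficiency(ht_values, hlt_flags, [250, 300, 400, 500])
```

Plotting the first element of each tuple against the bin centers gives the turn-on curve for the chosen variable.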
In fact, the triggers in this stream have event selection thresholds so low that the L1 trigger paths seeding them may not be fully efficient for the selected events. That is, there are events that would be selected by the scouting HLT paths but are rejected beforehand at the L1 stage. To quantify this effect, an additional minimum-bias trigger path is included that has a pass-through L1 seed (which is heavily prescaled but accepts every event it sees). For this trigger, an extremely loose selection is applied at the HLT level that demands only the presence of a single jet in the event. This trigger can be used in conjunction with the L1-only triggers just described to obtain a measurement of the L1 seed efficiencies:
$$\epsilon_{\text{L1}} = \frac{\#\,\text{passing L1 and min-bias}}{\#\,\text{passing min-bias}}. \tag{4.3}$$
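Because a scouting HLT path can fire only if its L1 seed has fired, the two measured efficiencies can be combined into an overall efficiency. The step below is a sketch under one stated assumption: that the HLT efficiency measured with the L1-only triggers (Eq. 4.2) also applies to the L1-passing events in the minimum-bias sample. The symbol $\epsilon_{\text{total}}$ is introduced here for illustration only.

```latex
\epsilon_{\text{total}}
  = P(\text{L1 and HLT} \mid \text{min-bias})
  = P(\text{L1} \mid \text{min-bias}) \, P(\text{HLT} \mid \text{L1}, \text{min-bias})
  \approx \epsilon_{\text{L1}} \, \epsilon_{\text{HLT}}
```

The first equality is the definition, the second is the product rule for conditional probabilities, and the final approximation is where the assumption enters.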
The PF-scouting stream also contains a trigger that selects events having two muons, with no requirement on jet activity in the event. The muons are required to have pT > 3 GeV and to have di-muon invariant mass mµµ > 10 GeV. An L1-only muon trigger path, and a zero-bias path accepting every event it sees, are added to the stream to assist with efficiency measurements for this trigger. The µµ scouting trigger was implemented as a proof-of-concept of a non-hadronic scouting trigger. A new di-muon scouting trigger was developed for the 2017 run, motivated by the use case of a search for dark photons. This trigger is described in Section 4.7.
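| Trigger | Rate in 2015 (Hz) | Rate in 2016 (Hz) | Rate in 2017 (Hz) | Notes |
|---|---|---|---|---|
| HT > 410 GeV | - | 750 | 740 | Deployed in 2016 |
| HT > 450 GeV | 160 | 500 | - | Removed in 2017 |
| mµµ > 10 GeV | 200 | 530 | - | Removed in 2017 |
| L1 HT | 7 | 8 | 40 | |
| L1 di-muon | 23 | 4 | 10 | |
| Min-bias | 3 | 5 | 30 | |
| Zero-bias | 16 | 9 | 10 | |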
Table 4.1: List of paths in the PF-scouting stream, with typical rates in the 2015, 2016, and 2017 LHC runs. Rates are computed for instantaneous luminosities of $5\times10^{33}$, $1\times10^{34}$, and $1.5\times10^{34}$ cm$^{-2}$s$^{-1}$, respectively, for 2015, 2016, and 2017.
The rates for the L1-only and min/zero-bias paths are controlled by prescale factors that change from year to year.
The calo-scouting stream
The list of trigger paths in the calo-scouting stream is provided in Table 4.2. Triggers in this stream reconstruct calo jets, which are used to produce output events in the calo-scouting format described above. Because reconstructing calo jets costs very little computing time, the rates of these triggers can be much higher than those of the PF-scouting triggers.
The calo-scouting triggers do not perform primary vertex reconstruction. However, primary vertices can be saved as part of the calo-scouting event content if they are available in the event. When another HLT path runs primary vertex reconstruction (usually as part of some other physics reconstruction sequence), the vertex objects are saved in the output scouting event. Having vertex information available in a fraction of the saved events can be useful in analysis. For example, it can be used to monitor changes in physics object traits as a function of the number of primary vertices.
Calo jets are reconstructed without using information from the CMS tracker, so b-tag information cannot be obtained for them with the usual HLT CSV sequence. Instead, a modified version of the b-tagging sequence is deployed. This method begins by performing the fast primary vertex reconstruction described in Section 2.3. Regional tracking is then performed using only hits consistent with the identified primary vertex and with the locations of the eight highest-pT jets in the event. This collection of tracks is used as input to the CSVv2 algorithm for b-jet identification [23]. As in the PF-scouting triggers, the b-tagging sequences in calo-scouting need to be placed in separate paths in order to avoid unnecessarily rejecting events.
Regional tracking can also be used to reject calo jets originating from pileup or noise. Pixel tracks are reconstructed inside of selected calo jets, and the variable
$$r_{\text{tracks}} \equiv \frac{\sum_{\text{tracks}} p_{T,\text{track}}}{p_{T,\text{jet}}}, \tag{4.4}$$
i.e., the ratio of the summed transverse momentum of the tracks to the transverse momentum of the jet, is computed. The distribution of this ratio is shown in Figure 4.6 for signal and pileup jets. Using rtracks allows one to reject 75% of pileup jets while retaining 95% of signal jets [102].
This variable is saved in the calo-scouting events for use by analyzers.
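As an illustration of how an analyzer might use the saved variable, the sketch below evaluates Eq. 4.4 for one jet and applies a cut. Two things here are assumptions rather than statements from the text: that pileup and noise jets populate low values of rtracks, and the cut value of 0.1, which is a placeholder and not the working point behind the quoted 75% and 95% figures. The function names are hypothetical.

```python
def r_tracks(track_pts, jet_pt):
    """Ratio of the summed track pT to the jet pT (Eq. 4.4)."""
    if jet_pt <= 0.0:
        raise ValueError("jet pT must be positive")
    return sum(track_pts) / jet_pt


def is_pileup_like(track_pts, jet_pt, min_ratio=0.1):
    """Flag a jet as pileup-like if r_tracks falls below min_ratio.

    min_ratio is a hypothetical placeholder, and the direction of the cut
    assumes that pileup jets have little associated track momentum.
    """
    return r_tracks(track_pts, jet_pt) < min_ratio


# Example: a 100 GeV jet with three matched pixel tracks, and one with none.
ratio = r_tracks([10.0, 20.0, 30.0], 100.0)
flag_signal_like = is_pileup_like([10.0, 20.0, 30.0], 100.0)
flag_trackless = is_pileup_like([], 100.0)
```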
Pixel track reconstruction and calo jet b-tagging were not included in the scouting framework for the 2015 LHC run. They were added in 2016, after the successful use of the 2015 calo-scouting data by the dijet resonance search team (described in the next chapter).
As for the PF-scouting stream, the calo-scouting stream contains additional triggers to assist in the measurement of trigger turn-on curves. These are exact copies of the L1-only and minimum-bias trigger paths described in the previous section, except that they reconstruct calo jets instead of running the full PF sequence. Because they are the same as those listed in Table 4.1, they are omitted from Table 4.2.
Figure 4.6: Distribution of the quantity rtracks for signal (blue) and pileup (red) jets [102].
| Trigger | Rate in 2015 (Hz) | Rate in 2016 (Hz) | Rate in 2017 (Hz) | Notes |
|---|---|---|---|---|
| HT > 250 GeV | 1500 | 4000 | 3900 | |
| Di-muon trigger | - | - | 1900 | Deployed in 2017 |
Table 4.2: List of paths in the calo-scouting stream, with typical rates in the 2015, 2016, and 2017 LHC runs. Duplicates of triggers in the PF-scouting stream (Table 4.1) are not shown. Rates are computed for instantaneous luminosities of $5\times10^{33}$, $1\times10^{34}$, and $1.5\times10^{34}$ cm$^{-2}$s$^{-1}$, respectively, for 2015, 2016, and 2017.
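The rates in Tables 4.1 and 4.2 can be combined with the event sizes quoted earlier to estimate the output bandwidth of the main hadronic paths. This is rough arithmetic only: it applies the average event sizes to the typical 2016 rates, uses 1 MB = 1000 kB, and the function name is hypothetical.

```python
def bandwidth_mb_per_s(rate_hz, event_size_kb):
    """Output bandwidth in MB/s for a given rate and average event size."""
    return rate_hz * event_size_kb / 1000.0  # 1 MB = 1000 kB


# Calo-scouting HT > 250 GeV path in 2016: 4000 Hz at about 1.5 kB per event.
calo_bw = bandwidth_mb_per_s(4000, 1.5)

# PF-scouting HT > 410 GeV path in 2016: 750 Hz at about 10 kB per event.
pf_bw = bandwidth_mb_per_s(750, 10.0)

# The same calo-scouting rate at the 10 kB event size of the Run I analysis.
calo_bw_run1_size = bandwidth_mb_per_s(4000, 10.0)
```

With these inputs the calo-scouting path writes about 6 MB/s and the PF-scouting path about 7.5 MB/s, whereas the calo-scouting rate at the Run I event size would correspond to about 40 MB/s.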
Scouting data monitoring
A third stream is implemented at the HLT with the purpose of monitoring the quality of the scouting triggers and events. This monitoring stream is a hybrid of a standard physics stream and a data scouting stream. Selected events undergo standard prompt reconstruction and are also saved in the scouting format. The dual output allows for side-by-side comparison of the scouting physics objects with standard reconstructed objects. It can be used to calibrate the energies and identification efficiencies of the trigger objects, and also to identify any problems with the stream configurations or the event packing software.
The scouting monitor stream contains copies of all of the triggers in the PF-scouting and calo-scouting streams. Because the output undergoes prompt reconstruction, the rate of the stream must be kept low. Prescales on the triggers in the scouting monitor stream are chosen in order to keep the total rate of the stream below about 30 Hz. The stream can then be used to study the scouting data without having a major impact on the HLT data volume.
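The size of the prescales involved can be estimated with a short calculation. The sketch below assumes a single common prescale applied to every path and ignores overlaps between paths, neither of which is stated in the text; in practice prescales may be set per path. The input rates are the typical 2016 values of the main paths from Tables 4.1 and 4.2, and the function name is hypothetical.

```python
from math import ceil


def common_prescale(rates_hz, target_hz):
    """Smallest integer prescale keeping the summed rate at or below target_hz.

    A prescale of N keeps one event out of every N. Overlaps between paths
    are ignored, so the summed rate is an upper estimate.
    """
    total = sum(rates_hz.values())
    return max(1, ceil(total / target_hz))


# Typical 2016 rates (Hz) of the main scouting paths.
rates_2016 = {
    "PF HT > 410 GeV": 750,
    "PF HT > 450 GeV": 500,
    "PF di-muon": 530,
    "Calo HT > 250 GeV": 4000,
}

prescale = common_prescale(rates_2016, target_hz=30)
monitor_rate = sum(rates_2016.values()) / prescale
```

Under these assumptions a common prescale of about 200 is enough to bring a combined rate of several kHz down to the 30 Hz budget.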