5.6.2 Monte Carlo Truth

The PHOSPHOR Fit relies on precise knowledge of the true values of the Eγ scale and resolution in the simulation. These values are also needed to validate the method. A robust and accurate way to extract them is therefore desirable.

Estimating the Monte Carlo true Eγ scale and resolution would be straightforward if the Eγ response followed a simple known density function, such as a Gaussian or the Crystal Ball (CB) line shape [122]. We could then simply fit that function to the simulated Eγ response and interpret its location and scale parameters as the Eγ scale and resolution. Unfortunately, this does not seem to be the case.

In Figure 5.11 we fit the Crystal Ball line shape to the Eγ response of photons in a subclass of µµγ events. We select only photons in the ECAL barrel with ET > 30 GeV and R9 > 0.94. The fitted function describes the data very poorly: the tails of the data are much heavier than the function can model, which biases the fit.
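For readers unfamiliar with the Crystal Ball line shape, the following is a minimal Python sketch of its commonly used form: a Gaussian core joined to a power-law tail on the low side. The parametrization (alpha as the transition point in units of sigma, n as the tail power) and the numerical values in the example are assumptions made for illustration; they are not taken from the fit in Figure 5.11, and the shape is left unnormalized.

```python
import math


def crystal_ball_shape(x, mu, sigma, alpha, n):
    """Unnormalized Crystal Ball shape with a power-law tail on the low side.

    Assumed conventions: alpha > 0 is the core-to-tail transition point in
    units of sigma, n > 0 is the tail power, and the value at x == mu is 1.
    """
    t = (x - mu) / sigma
    if t > -alpha:
        # Gaussian core
        return math.exp(-0.5 * t * t)
    # Power-law tail, matched to the core in value and slope at t == -alpha
    a = (n / alpha) ** n * math.exp(-0.5 * alpha * alpha)
    b = n / alpha - alpha
    return a * (b - t) ** (-n)


# Hypothetical parameter values, chosen only to compare tail and core
tail_value = crystal_ball_shape(-4.0, mu=0.0, sigma=1.0, alpha=1.5, n=5.0)
gauss_value = math.exp(-0.5 * 4.0 ** 2)
print(tail_value, gauss_value)
```

Even with its power-law tail, the shape has only one tail and a fixed functional form, which is one way to see why it may fail when the simulated response has heavy tails on both sides.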

[Figure 5.11 plot panels. Crystal Ball fit, 99% containment; barrel, R9 > 0.94, ET ∈ [30, 100) GeV; 5867 events; χ2/ndof = 3.09e+05/85; p-value = 0. Fitted parameters: µ = −0.3799 ± 0.021, σ = 1.603 ± 0.015, n = 5.1 ± 7.0. Axes: Events / (0.05%), χ2 pulls, and χ2 residuals versus E/Etrue − 1 (%); bins of E/Etrue − 1 versus χ2 pulls.]

Figure 5.11: An example of Eγ response mismodeling by the CB line shape. These events have photons in the ECAL barrel with ET >30 GeV and R9 > 0.94. The fit range contains 99% of all events. (Top Left) The data and the fitted function on a logarithmic y-axis scale with an x-axis range covering 99.99994% (5-σ) of the data. (Top Middle) The data and the fitted function on a linear y-axis scale with an x-axis range covering 95.4% (2-σ) of the data.

[Figure 5.12 plot panels. Gaussian fit, 70% containment; barrel, R9 > 0.94, ET ∈ [30, 100) GeV; 5867 events; χ2/ndof = 32.9/48; p-value = 0.95. Axes: Events / (0.05%), χ2 pulls, and χ2 residuals versus E/Etrue − 1 (%); bins of E/Etrue − 1 versus χ2 pulls.]

Figure 5.12: An example of good modeling of the Eγ response peak by a Gaussian. (Top Left) The data and the fitted function on a logarithmic y-axis scale with an x-axis range covering 99.99994% (5-σ) of the data. (Top Middle) The data and the fitted function on a linear y-axis scale with an x-axis range covering 95.4% (2-σ) of the data. This fit demonstrates the impact of limiting the fit range on the goodness of the Eγ response modeling and should be compared with the reference fit in Figure 5.11. Here, the data are the same (photons in the ECAL barrel with ET > 30 GeV and R9 > 0.94) and the model is even simpler (a Gaussian instead of the CB line shape), yet, in contrast to the reference fit, the model describes the data reasonably well. The important difference is that the fit range is a modal interval containing only 70% of all events instead of 99%.

To judge the goodness of a fit, we plot the χ2 residuals and pulls as a function of the observable x. The χ2 residuals are defined for each bin i = 1, . . . , N as follows:

\[ \Delta n_i = n_i - \nu_i, \tag{5.9} \]

where ni and νi are the numbers of observed and expected events in bin i, respectively. The number of expected events in bin i is:

\[ \nu_i = \sum_{j=1}^{N} n_j \int_{a_i}^{b_i} f(x|\theta)\,dx, \tag{5.10} \]

where the sum over j runs over all bins. Here, f(x|θ) is the fitted model depending on P parameters θ = (θ1, . . . , θP), and ai and bi are the lower and upper boundaries of bin i = 1, . . . , N, respectively. We choose the binning such that ni ≥ 30 for all i. This guarantees that νi > 5 for all i at a very high confidence level. We define the χ2 pulls as:

\[ \chi_i = \frac{n_i - \nu_i}{\sqrt{\nu_i}}. \tag{5.11} \]

Large absolute values of the χ2 residuals and pulls indicate poor compatibility of the model f(x|θ) with the data. For each bin, we plot ∆ni and χi at the median x of that bin.
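The definitions in Eqs. (5.9)–(5.11) can be illustrated with a short Python sketch. It assumes that the model is supplied through its cumulative distribution function and that the model is normalized to unity over the binned range; the function names, the uniform toy model, and the bin contents are hypothetical and serve only as an example.

```python
import math


def expected_counts(counts, edges, cdf):
    """Eq. (5.10): nu_i = (sum_j n_j) * integral of f(x|theta) over bin i.

    The bin integral is taken as cdf(b_i) - cdf(a_i); the model is assumed
    to be normalized to unity over the full binned range.
    """
    total = sum(counts)
    return [total * (cdf(b) - cdf(a)) for a, b in zip(edges[:-1], edges[1:])]


def residuals_and_pulls(counts, expected):
    """Eq. (5.9) residuals n_i - nu_i and Eq. (5.11) pulls (n_i - nu_i)/sqrt(nu_i)."""
    residuals = [n - nu for n, nu in zip(counts, expected)]
    pulls = [(n - nu) / math.sqrt(nu) for n, nu in zip(counts, expected)]
    return residuals, pulls


def uniform_cdf(x):
    """Toy model: uniform density on [0, 1] (illustration only)."""
    return min(max(x, 0.0), 1.0)


edges = [0.0, 0.25, 0.5, 0.75, 1.0]   # hypothetical bin boundaries a_i, b_i
counts = [40, 20, 30, 30]             # hypothetical observed counts n_i
nu = expected_counts(counts, edges, uniform_cdf)
residuals, pulls = residuals_and_pulls(counts, nu)
print(nu, residuals, pulls)
```

In this toy case every bin has the same expected count, so the pulls differ from the residuals only by a common factor; with unequal νi the pulls weight each bin by its expected statistical fluctuation.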

We also plot the distribution of the χ2 pulls. It should follow a unit Gaussian if f(x|θ) describes the data well:

\[ \chi_i \sim \mathcal{N}(x|0,1). \tag{5.12} \]

Therefore, we overlay the spectrum of the χ2 pulls with a properly normalized unit Gaussian to check their mutual compatibility.

As another goodness-of-fit test, we calculate the χ2 statistic as [123]:

\[ \chi^2 = \sum_{i=1}^{N} \chi_i^2, \tag{5.13} \]

where the sum over i runs over all bins. If the data follow the model f(x|θ), the distribution of the χ2 statistic approaches a known probability density function (PDF), the so-called χ2 PDF f(z|nd). Here, nd is the number of degrees of freedom:

\[ n_d = N - P. \tag{5.14} \]

The χ2 statistic follows the χ2 PDF in the limit of high statistics. In practice, the χ2 PDF is a good approximation of the actual χ2 distribution when νi > 5 for all i = 1, . . . , N [123]. This condition is satisfied by our choice of the binning. We can thus use the χ2 PDF to calculate the p-value of the χ2 statistic as:

\[ p = \int_{\chi^2}^{\infty} f(z|n_d)\,dz. \tag{5.15} \]

The p-value expresses the probability that the χ2 statistic of a random sample would attain a greater value than the χ2 statistic of the sample at hand. If the model describes the data, the p-value is uniformly distributed between 0 and 1. Poor compatibility of the model f(x|θ) with the data leads to low p-values.
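As an illustration of Eqs. (5.13)–(5.15), the sketch below evaluates the χ2 p-value with the Python standard library only. It relies on the fact that the upper tail integral of the χ2 PDF equals the regularized upper incomplete gamma function Q(nd/2, χ2/2), evaluated here with a textbook series and continued-fraction expansion. The function names are hypothetical, and this is not necessarily how the p-values in the figures were computed.

```python
import math


def _gamma_q(a, x, eps=1e-12, max_iter=10000):
    """Regularized upper incomplete gamma function Q(a, x) for a > 0, x >= 0."""
    if x <= 0.0:
        return 1.0
    log_prefactor = -x + a * math.log(x) - math.lgamma(a)
    if x < a + 1.0:
        # Series expansion of the lower function P(a, x); Q = 1 - P
        term = 1.0 / a
        total = term
        ap = a
        for _ in range(max_iter):
            ap += 1.0
            term *= x / ap
            total += term
            if abs(term) < abs(total) * eps:
                break
        return 1.0 - total * math.exp(log_prefactor)
    # Continued fraction for Q(a, x), evaluated with the modified Lentz method
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, max_iter + 1):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h * math.exp(log_prefactor)


def chi2_statistic(pulls):
    """Eq. (5.13): sum of squared pulls over all bins."""
    return sum(p * p for p in pulls)


def chi2_p_value(chi2, ndof):
    """Eq. (5.15): upper tail probability of the chi2 PDF with ndof degrees of freedom."""
    return _gamma_q(0.5 * ndof, 0.5 * chi2)


# chi2 and ndof values as quoted in the panels of Figures 5.11-5.15
for chi2, ndof in [(3.09e5, 85), (32.9, 48), (73.9, 47), (73.9, 45), (52.7, 46)]:
    print(chi2, ndof, chi2_p_value(chi2, ndof))
```

Because the quoted χ2 values are rounded, the printed p-values should agree with those in the figure panels only up to rounding.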

[Figure 5.13 plot panels. Gaussian fit, 70% containment; endcaps, R9 < 0.95, ET ∈ [10, 12) GeV; 8141 events; χ2/ndof = 73.9/47; p-value = 0.0074. Fitted parameters: µ = 11.86 ± 0.17, σ = 8.70 ± 0.21. Axes: Events / (0.5%), χ2 pulls, and χ2 residuals versus E/Etrue − 1 (%); bins of E/Etrue − 1 versus χ2 pulls.]

Figure 5.13: An example of the Eγ response peak mismodeling by a Gaussian. These events have photons in the ECAL endcaps with ET =10–12 GeV and R9 < 0.94. The fit range contains 70% of all events. (Top Left) The data and the fitted function on a logarithmic y-axis scale with an x-axis range covering 99.99994% (5-σ) of the data. (Top Middle) The data and the fitted function on a linear y-axis scale with an x-axis range covering 95.4% (2-σ) of the data.

[Figure 5.14 plot panels. Crystal Ball fit, 70% containment; endcaps, R9 < 0.95, ET ∈ [10, 12) GeV; 8141 events; χ2/ndof = 73.9/45; p-value = 0.0043. Fitted parameters recovered: σ = 8.70 ± 0.20, n = 0.1 ± 8.7. Axes: Events / (0.5%), χ2 pulls, and χ2 residuals versus E/Etrue − 1 (%); bins of E/Etrue − 1 versus χ2 pulls.]

Figure 5.14: An example of the Eγ response peak mismodeling by a CB line shape. Same as Figure 5.13 but for the CB line shape instead of a Gaussian.

[Figure 5.15 plot panels. Bifurcated Gaussian fit, 70% containment; endcaps, R9 < 0.95, ET ∈ [10, 12) GeV; 8141 events; χ2/ndof = 52.7/46; p-value = 0.23. Fitted parameters: s = 13.77 ± 0.44, σL = 10.30 ± 0.45, σR = 6.39 ± 0.46. Axes: Events / (0.5%), χ2 pulls, and χ2 residuals versus E/Etrue − 1 (%); bins of E/Etrue − 1 versus χ2 pulls.]

Figure 5.15: An example of the Eγ response peak mismodeling by a bifurcated Gaussian. Same as Figure 5.13 but for a bifurcated Gaussian instead of a Gaussian.

The poor modeling of the tails can be mitigated by fitting only a subset of the data near the peak, see Figure 5.12. However, reducing the fit range introduces an additional systematic uncertainty from the choice of the range, and the approach is fragile because the behavior varies greatly among photon categories defined by the photon pT, η, and R9. Figures 5.13–5.15 demonstrate the limitation of this approach for photons in the endcaps with ET ∈ [10, 12) GeV and R9 < 0.94, for increasingly complex analytical models.
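A fit range restricted to the peak, such as the 70% modal interval used in Figures 5.12–5.15, could be selected along the following lines. This sketch assumes that "modal interval" means the shortest interval containing the requested fraction of events; the text does not spell out the definition, so this reading, the function name, and the toy sample are assumptions.

```python
import math


def shortest_interval(values, fraction):
    """Return (low, high) of the shortest interval holding `fraction` of the values.

    Assumption: this is what "modal interval" refers to; ties are resolved
    in favor of the lowest such interval.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    data = sorted(values)
    n = len(data)
    # Number of events the interval must contain (guarded against round-off)
    k = max(1, math.ceil(fraction * n - 1e-9))
    best_lo, best_hi = data[0], data[k - 1]
    # Slide a window of k consecutive sorted values and keep the narrowest one
    for start in range(1, n - k + 1):
        lo, hi = data[start], data[start + k - 1]
        if hi - lo < best_hi - best_lo:
            best_lo, best_hi = lo, hi
    return best_lo, best_hi


# Toy response sample (in %): a narrow peak plus a few far outliers
sample = [1.0, 1.1, 1.2, 1.3, 5.0, -4.0, 9.0, 1.15, 1.25, 0.9]
fit_range = shortest_interval(sample, 0.7)
print(fit_range)
```

The interval follows the peak of the sample rather than being symmetric around a fixed value, which is why the outliers in the toy sample do not affect it.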