Analysis of HH to bbZZ to 2b 2l 2nu
Rami Kamalieddin, Ilya Kravchenko (UNL)
Michele de Gruttola (CERN)
Lesya Shchutska (ETH)

Big thanks for useful discussions to


Giacomo Ortona, Luca Cadamuro, Benjamin Stieger

HH meeting, 23 May 2017



OUTLINE
Analysis signature, strategy

Analysis

Complicated issue that we face

Possible solutions tried

Future plans

Rami Kamalieddin (UNL), HH → bbZZ → 2b2l2ν, HH meeting, 23 May 2017



HH → bbZZ → 2b 2l 2nu


2b 2l 2ν signature

In our analysis, the HH pair decays through:

H → bb, to give the highest BR
    2 b-jets from the Higgs → bb decay
H → ZZ*, with Z → ll, to give important handles which improve sensitivity
    2 leptons (electrons, muons) from the on-shell Z
    MET from the off-shell Z* boson

* Picture based on the plot from the bbWW team

Analysis strategy

1. Object selection
    Requirements on leptons, jets and MET before constructing a candidate
2. Build candidate
    Simple common-sense cuts: Nb-jets, mbb, Nlep, mll, MET
    OR
    BDT with the above variables
3. Prepare input to Combine
    Simple shape: HH trans. mass, ZH mt2, etc.
    OR
    BDT with all the possible kinematic and other variables
4. Limits
    Final target is to derive the best limits we can


Goal of the analysis


Our objective

Set a 95% CL limit on
σ(pp → X → hh) × BR(hh → bbZZ → bb ll MET) vs. the mass of X

The method we will use to achieve it:


Binned shape analysis using Higgs Combination Tool
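
As a rough illustration of what a binned shape analysis fits, the sketch below evaluates a Poisson likelihood over the bins of a template and scans a signal-strength parameter. It is not the Higgs Combination Tool, which additionally handles nuisance parameters and the limit-setting procedure; the function names, the scan range of `mu` and the idea of a brute-force scan are assumptions made for this example only.

```python
import math


def binned_nll(mu, observed, signal, background):
    """Negative log-likelihood (constant terms dropped) of independent Poisson bins.

    Each bin expects mu * signal + background events; mu is the signal strength.
    """
    nll = 0.0
    for n, s, b in zip(observed, signal, background):
        expected = mu * s + b
        nll += expected - n * math.log(expected)
    return nll


def best_fit_mu(observed, signal, background, mu_max=5.0, steps=5000):
    """Brute-force scan of mu in [0, mu_max]; returns the grid value with the lowest NLL."""
    grid = [mu_max * i / steps for i in range(steps + 1)]
    return min(grid, key=lambda mu: binned_nll(mu, observed, signal, background))
```

A more signal-like shape concentrates `signal` in bins where `background` is small, which sharpens the minimum of the likelihood in `mu`; that is the sense in which the choice of distribution affects the limit.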


Shape analysis, cont.

Questions that we need to address:

Which distribution to use?
    Plenty of options.
    Ideally, use the one that gives the best limit.

How difficult is it to construct this shape?

What complications could arise at the pre-approval/approval stage?

Keeping the latter in mind, let's explore the first question in full depth.
Any suggestions are highly appreciated.

HH transverse mass shape


Simplest shape
Possibilities:

1. HH_mt is the simplest common-sense option:

+ Easy to construct this variable
- We cut on five different variables before we build HH_mt, so those five cuts have to be optimized. Two ways to do it:
    Run Punzi significance or similar on 5^5 cuts (vary each cut at least twice down and up)
    Use MVA methods (LD, BDT, DNN), then simply cut on the response of the produced MVA distribution
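
The cut scan can be sketched as follows. The Punzi figure of merit is the signal efficiency divided by `a/2 + sqrt(B)`, where `B` is the expected background and `a` the desired number of sigmas. Varying each of five cuts twice down and twice up around a nominal value gives 5 values per cut, hence 5^5 = 3125 combinations. The cut names, nominal values and step sizes are placeholders chosen for illustration, not the values used in the analysis.

```python
import itertools
import math


def punzi(signal_eff, n_background, a=2.0):
    """Punzi figure of merit: eff_S / (a/2 + sqrt(B))."""
    return signal_eff / (a / 2.0 + math.sqrt(n_background))


def cut_grid(nominal, step, n_variations=2):
    """All combinations of cut values: nominal plus n_variations steps down and up."""
    axes = [
        [nominal[name] + k * step[name] for k in range(-n_variations, n_variations + 1)]
        for name in nominal
    ]
    return [dict(zip(nominal, combo)) for combo in itertools.product(*axes)]


# Hypothetical nominal thresholds and step sizes (placeholders only).
NOMINAL = {"nbjets": 2.0, "mbb": 125.0, "nlep": 2.0, "mll": 91.0, "met": 40.0}
STEP = {"nbjets": 1.0, "mbb": 5.0, "nlep": 1.0, "mll": 2.0, "met": 5.0}
```

For each point of `cut_grid(NOMINAL, STEP)` one would evaluate the signal efficiency and the background yield on simulation and keep the point with the largest `punzi` value.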


BDT shape
Possibilities:

2. BDT response as an input shape to the Higgs Combination Tool

Which variables to use?
    We have 58 variables: pt, eta, phi, mt, mt2, dEta, dPhi, dR for b-jets, leptons, Z, MET, Hs and HH.
    Discriminating power varies with the MVA method.

How to address correlations?
    Assume the BDT will take care of it, i.e. do nothing
    Apply a decorrelation transformation prior to training

What to do with variables used to define control regions (CR)?
    Use for CRs, but exclude from the BDT set of variables
    (we define CRTT inverting Zll mass, and CRTT inverting Hbb_mass)
    Use in both the BDT and to define CRs
    Use in the BDT, but define new CRs
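
One possible form of the decorrelation transformation mentioned above is a linear whitening of the input variables. The sketch below uses a Cholesky factor of the sample covariance matrix, which is an assumption made for brevity; other tools use different factorizations (for example a symmetric square root), and all function names here are illustrative.

```python
import math


def covariance(rows):
    """Column means and sample covariance matrix of a list of equal-length rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    return means, cov


def cholesky(c):
    """Lower-triangular L with L * L^T == c (c must be positive definite)."""
    d = len(c)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(c[i][i] - s)
            else:
                low[i][j] = (c[i][j] - s) / low[j][j]
    return low


def decorrelate(rows):
    """Return centered rows transformed by L^-1, so their covariance is the identity."""
    means, cov = covariance(rows)
    low = cholesky(cov)
    out = []
    for r in rows:
        x = [r[j] - means[j] for j in range(len(r))]
        y = []
        for i in range(len(x)):  # forward substitution: solve low * y = x
            y.append((x[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i])
        out.append(y)
    return out
```

Such a transformation removes only linear correlations; non-linear dependencies among the inputs remain, which is one reason to compare it against the "do nothing" option.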


TWO different BDTs
Possibilities:

3. Two different (by purpose) BDTs:
    One BDT with common-sense variables (e.g., #b-jets, Zll and Hbb mass) to construct the most promising variables (e.g., HH_mt, ZH_mt2, etc.)
    Another BDT uses these most promising variables to create the shape for the Higgs Combination Tool

+ May be the best option performance-wise
- Difficulty validating them?
- Which goes where:
    common-sense variables vs. variables to define CRs
    variables to define CRs vs. promising variables


N-dimensional BDT
4. N-dimensional BDT shapes mapped into a 1D histogram for the Higgs Combination Tool

+ Similar procedure is used in tHq/ttH (2D into 1D)
- 1D shape has to be optimized too
- Complexity of the problem: how many BDTs do we need?
    Two dominant backgrounds (TT and DY)
    Signal-like BG (SM ZH)
    Signal MC samples cover 10 points from 260 to 1000 GeV

Options for the signal training:
    Optimize 10 signal mass points
    Optimize low and high mass regions, similar to bbWW
    One parametrized learning training
    Optimize four regions:
        low mass (260 and 270 GeV)
        at mass (at the peak of DY and TT, 300-450 GeV)
        mid mass (600 and 650 GeV)
        high mass (900 and 1000 GeV)

A 4D BDT shape can be constructed (see it as a 3D shape wrapped into a heat map), then translated into a 1D histogram
* Picture from the tHq analysis
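
To make the "N-dimensional into 1D" idea concrete, here is a minimal sketch for the 2D case: two classifier responses are binned on a grid and the grid is unrolled row by row into a single list of bin contents. The row-major ordering, the response range of [-1, 1] and the function names are assumptions; the slides do not specify the mapping used in tHq/ttH, which may order the bins differently.

```python
def bin_index(value, lo, hi, nbins):
    """Index of the bin containing value, clamped so out-of-range values land in edge bins."""
    index = int((value - lo) / (hi - lo) * nbins)
    return max(0, min(nbins - 1, index))


def fill_unrolled(pairs, nbins_x, nbins_y, lo=-1.0, hi=1.0):
    """Fill a 2D histogram of (x, y) responses and return it unrolled into 1D.

    Bin (ix, iy) of the 2D grid becomes bin iy * nbins_x + ix of the 1D histogram.
    """
    hist = [0] * (nbins_x * nbins_y)
    for x, y in pairs:
        ix = bin_index(x, lo, hi, nbins_x)
        iy = bin_index(y, lo, hi, nbins_y)
        hist[iy * nbins_x + ix] += 1
    return hist
```

The same indexing extends to more dimensions by adding further strides, but the number of bins grows quickly, which is why the 1D binning itself has to be optimized.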

The scheme we chose


BDT shape for Combine

Optimize a BDT with all 58 variables (loose selection on the next slide)
Use the BDT shape for the Higgs Combination Tool

+ Sounds like the simplest option
- Type of training?
    Binary classification (S vs total BG)
    MultiClass classification (S vs TT vs DY)
- Low statistics for signal, need specific bbZZ samples
    For now, stitch samples in four regions:
        low mass (260 and 270 GeV)
        at mass (at the peak of DY and TT, meaning 300-450 GeV)
        mid mass (600 and 650 GeV)
        high mass (900 and 1000 GeV)
    Check what gives the best ROC integral
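
For reference, the ROC integral used as the figure of merit here is the area under the background-rejection versus signal-efficiency curve. That area equals the probability that a randomly chosen signal event scores higher than a randomly chosen background event, which the sketch below computes directly. It ignores event weights, which a real comparison would need; the function name is illustrative.

```python
def roc_integral(signal_scores, background_scores):
    """Area under the background-rejection vs signal-efficiency curve.

    Computed as the fraction of (signal, background) pairs in which the signal
    event has the higher classifier score; ties count as one half.
    """
    wins = 0.0
    for s in signal_scores:
        for b in background_scores:
            if s > b:
                wins += 1.0
            elif s == b:
                wins += 0.5
    return wins / (len(signal_scores) * len(background_scores))
```

A value of 0.5 corresponds to no separation and 1.0 to perfect separation, so the four stitched regions can be ranked by this single number.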



Loose selection for BDT training

Apply almost no cuts to give the BDT as much information as possible.
A few cuts had to be applied to avoid NaNs.

HH common-sense preselection:

Dilepton (ee/μμ) mass > 50 GeV
2 jets: pt > 30 GeV and |η| < 2.4
75 GeV < Hbb mass < 175 GeV
56 GeV < dilepton mass < 126 GeV
Transverse mass of HH > 100 GeV
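
The cuts listed above can be written as a single event filter. This is a sketch only: the event layout (a dictionary with `jets`, `mll`, `hbb_mass` and `hh_mt` keys) is hypothetical, and "2 jets" is read here as "at least two jets passing the pt and eta requirements", which is an assumption.

```python
def passes_loose_selection(event):
    """Apply the loose preselection; all masses and momenta are in GeV."""
    good_jets = [j for j in event["jets"] if j["pt"] > 30.0 and abs(j["eta"]) < 2.4]
    return (
        len(good_jets) >= 2                      # assumption: at least two good jets
        and event["mll"] > 50.0                  # dilepton mass > 50 GeV
        and 56.0 < event["mll"] < 126.0          # dilepton mass window
        and 75.0 < event["hbb_mass"] < 175.0     # Hbb mass window
        and event["hh_mt"] > 100.0               # HH transverse mass
    )
```

Note that the dilepton mass window of 56 to 126 GeV already implies the mass > 50 GeV requirement, so the latter only matters if the window is relaxed.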


Binary classification:
S vs total BG


Binary classification
Binary classification: S vs total BG (TT+DY) with 58 vars
+ Fast
- Information about correlations among BG samples is lost

ROC integrals per mass region:

Region       Mass points     LD       BDT
low mass     260+270 GeV     0.723    0.917
at mass      300-450 GeV     0.903    0.921
mid mass     600+650 GeV     0.981    0.992
high mass    900+1000 GeV    0.996    0.997

btag0 in all sets
low mass vs other sets


Binary classification, low mass

[Figure: TMVA input-variable distributions, signal vs background, for zhmass, mt2_b1l1b2l2met, mt2_b1l2b2l1met, btag0, btag1, nbjets, min_mt2_blmet, dR_lb_min, dR_ZH and dPhi_lb_min]
[Figure: correlation matrices (linear correlation coefficients in %) of the same ten variables, for signal and for background]
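
The "linear correlation coefficients in %" quoted in the correlation matrices are Pearson correlation coefficients multiplied by 100 and rounded. A minimal sketch of that quantity for one pair of variables, with an illustrative function name and without event weights:

```python
import math


def linear_correlation_percent(xs, ys):
    """Pearson correlation coefficient of two equal-length samples, in percent."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return round(100.0 * sxy / math.sqrt(sxx * syy))
```

Because the coefficient is only sensitive to linear dependence, a small value in the matrix does not guarantee that two inputs are independent.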


Binary classification, low mass, cont.

ROC integral for LD = 0.723
ROC integral for BDT = 0.917

[Figure: background rejection versus signal efficiency (ROC curves) for BDT and LD]
[Figure: TMVA overtraining check for LD, test vs training samples; Kolmogorov-Smirnov test: signal (background) probability = 0.049 (0.168)]
[Figure: TMVA overtraining check for BDT, test vs training samples; Kolmogorov-Smirnov test: signal (background) probability = 0.409 (0)]
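
The overtraining check compares the classifier response on the training and test samples with a Kolmogorov-Smirnov test. The sketch below shows an unbinned two-sample version with the usual asymptotic probability; it is an illustration under assumptions, since the numbers on the plots come from the plotting tool (which works on binned histograms), and the function names are not from any library.

```python
import math


def ks_statistic(sample_a, sample_b):
    """Largest distance between the two empirical cumulative distributions."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    distance = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        distance = max(distance, abs(i / len(a) - j / len(b)))
    return distance


def ks_probability(distance, n_a, n_b, terms=100):
    """Asymptotic probability of a distance at least this large for a common parent."""
    n_eff = n_a * n_b / (n_a + n_b)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * distance
    if lam < 0.2:
        return 1.0  # the alternating series converges slowly here; the value is ~1
    series = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam) for k in range(1, terms + 1)
    )
    return min(1.0, max(0.0, 2.0 * series))
```

A probability close to zero indicates that the training and test responses differ more than statistical fluctuations would suggest, which is the usual sign of overtraining.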


Binary classification, at mass

[Figure: TMVA input-variable distributions, signal vs background, for bpt0, zmass, hmass1, hToZZ_mt_cosine, dR_leps, dR_bjets, btag0, btag1, leppt0, zpt0, hpt0 and hpt1]
[Figure: correlation matrices (linear correlation coefficients in %), for signal and for background, of bpt0, zmass, hmass1, hToZZ_mt_cosine, dR_leps, dR_bjets, btag0, btag1, leppt0, zpt0, hpt0, hpt1, nbjets, dEta_lb_min, dR_ZH, mt2_llmet and mt2_ZHmet]

Binary classification, at mass, cont.

ROC integral for LD = 0.903
ROC integral for BDT = 0.921

[Figure: background rejection versus signal efficiency (ROC curves) for BDT and LD]
[Figure: TMVA overtraining check for LD, test vs training samples; Kolmogorov-Smirnov test: signal (background) probability = 0.341 (0.073)]
[Figure: TMVA overtraining check for BDT, test vs training samples; Kolmogorov-Smirnov test: signal (background) probability = 0.248 (0.005)]


Binary classification, mid mass

[Figure: TMVA input-variable distributions, signal vs background, for bpt0, hmt0, dR_leps, hhmt, dR_bjets, btag0, btag1, leppt0, zpt0, hpt0, hpt1 and nbjets]
[Figure: correlation matrices (linear correlation coefficients in %), for signal and for background, of bpt0, hmt0, dR_leps, hhmt, dR_bjets, btag0, btag1, leppt0, zpt0, hpt0, hpt1, nbjets, zhmass, mt2_bbmet, mt2_b1l2b2l1met and min_mt2_blmet]

Binary classification, mid mass, cont.

ROC integral for LD = 0.981
ROC integral for BDT = 0.992

[Figure: background rejection versus signal efficiency (ROC curves) for BDT and LD]
[Figure: TMVA overtraining check for LD, test vs training samples; Kolmogorov-Smirnov test: signal (background) probability = 0.052 (0.001)]
[Figure: TMVA overtraining check for BDT, test vs training samples; Kolmogorov-Smirnov test: signal (background) probability = 0.652 (0)]


Binary classification, high mass

[Figure: TMVA input-variable distributions, signal vs background, for leppt0, zpt0, hpt0, hmt1, hpt1, zhmass, bpt0, hmt0, dR_leps, hhmt, dR_bjets and btag0]
[Figure: correlation matrices (linear correlation coefficients in %), for signal and for background, of bpt0, hmt0, dR_leps, hhmt, dR_bjets, btag0, leppt0, zpt0, hpt0, hmt1, hpt1, zhmass, mt2_bbmet, mt2_b1l1b2l2met, mt2_b1l2b2l1met and min_mt2_blmet]

Binary classification, high mass, cont.

ROC integral for LD = 0.996
ROC integral for BDT = 0.997

[Figure: background rejection versus signal efficiency (ROC curves) for BDT and LD]
[Figure: TMVA overtraining checks for LD and BDT, test vs training samples; the two panels quote Kolmogorov-Smirnov test signal (background) probabilities of 0 (0) and 0.101 (0.807)]

MultiClass:
S vs TT vs DY


MultiClass
MultiClass: S vs TT vs DY for all 58 vars

+ All the information available in the MC is preserved
+ May give an increase in performance when enough statistics are available
    Request/produce new large signal samples?
- Slow: easily scales from hours to days with more statistics, additional types of BG, and an increased number of BDT trees and layers
    Solved by running on GPU with a tf/theano backend


MultiClass
MultiClass: S vs TT vs DY for all 58 vars

None of the variables appear in all sets
The high mass region has fewer variables in common with respect to the other sets

BDT ROC integrals:

Mass points     TT       DY       Signal
260+270 GeV     0.025    0.003    0.997
300-450 GeV     0.172    0.007    0.999
600+650 GeV     0.328    0.289    0.999
900+1000 GeV    0.190    0.783    1.000


Most information can be inferred from previous slides,
so I will show only several plots


MultiClass, low mass

[Figure: input-variable distributions for DY, Signal and TT, shown for metpt and zmass]
[Figure: correlation matrices (linear correlation coefficients in %), for Signal, DY and TT, of metpt, zmass, hToZZ_mt_cosine, hh_mt_cosine, mt2_llmet, mt2_bbmet and mt2_ZHmet]

MultiClass, low mass, cont.

BDT ROC integrals:

TT      DY       Signal
0.34    0.056    0.997


MultiClass, at mass

[Figure: input-variable distributions for TT, DY and Signal, shown for metpt and zmass]
[Figure: correlation matrices (linear correlation coefficients in %), for Signal, DY and TT, of metpt, zmass, hToZZ_mt_cosine, hh_mt_cosine, zhmass, min_mt2_blmet and mt2_ZHmet]

MultiClass, at mass, cont.

BDT ROC integrals for:


TT DY Signal
0.172 0.007 0.999

[Figure: TMVA response for classifier BDT, three panels (BDT response for Signal, for DY and for TT). Each panel overlays TT, DY and Signal for the test and training samples; vertical axis (1/N) dN/dx, horizontal axis from 0 to 1.]
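The response plots overlay test and training samples, which is the usual visual check for overtraining. One way to quantify that comparison is the two-sample Kolmogorov-Smirnov distance, sketched below in plain Python. This is an illustration only: the function name and the toy scores are hypothetical, and the sketch computes the distance statistic, not the probability that TMVA may print on its own plots.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov distance: the largest gap between
    the empirical cumulative distributions of the two samples."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    i = j = 0
    distance = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past every value <= x (this handles ties).
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        distance = max(distance, abs(i / len(a) - j / len(b)))
    return distance


# Toy BDT responses (hypothetical numbers).
train_scores = [0.10, 0.20, 0.30, 0.40]
test_scores = [0.10, 0.20, 0.30, 0.40]
shifted_scores = [0.80, 0.90]

same = ks_statistic(train_scores, test_scores)
different = ks_statistic(train_scores[:2], shifted_scores)
print(same, different)
```

A distance close to 0 means the training and test responses agree; a large distance suggests the classifier has learned features specific to the training sample.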


MultiClass, mid mass


[Figure: input-variable distributions of metpt, btag0 and zmass for TT, DY and Signal.]

[Figure: correlation matrices (linear correlation coefficients in %) for Signal, DY and TT. Variables: metpt, btag0, zmass, hToZZ_mt_cosine, nbjets, dR_ZH, zhmass, mt2_b1l2b2l1met, min_mt2_blmet, mt2_ZHmet.]

MultiClass, mid mass, cont.

BDT ROC integrals for:


TT DY Signal
0.328 0.289 0.999

[Figure: TMVA response for classifier BDT, three panels (BDT response for Signal, for DY and for TT). Each panel overlays TT, DY and Signal for the test and training samples; vertical axis (1/N) dN/dx, horizontal axis from 0 to 1.]


MultiClass, high mass

[Figure: input-variable distributions of bpt0, leppt0, zpt0, hmt0, hpt0 and hmt1 for Signal, TT and DY.]

[Figure: correlation matrices (linear correlation coefficients in %) for Signal, DY and TT. Variables: bpt0, leppt0, zpt0, hmt0, hpt0, hmt1, hpt1, dR_bjets, dR_leps, hhmt, zhmass, mt2_b1l1b2l2met, mt2_b1l2b2l1met, min_mt2_blmet.]

MultiClass, high mass, cont.

BDT ROC integrals for:


TT DY Signal
0.190 0.783 1.000

[Figure: TMVA response for classifier BDT, three panels (BDT response for Signal, for DY and for TT). Each panel overlays Signal, TT and DY for the test and training samples; vertical axis (1/N) dN/dx, horizontal axis from 0 to 1.]


Conclusions
The analysis strategy is clear: a binned shape analysis using Higgs Combine

BDT chosen over cut-and-count (CnC) for performance and practicality

Multiclass BDT, given the available statistics, does not add extra power
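To make "binned shape" concrete, the sketch below fills a weighted histogram of BDT scores, one template per process. It shows only the binning step: the function name, the bin count, the score range and the toy events are assumptions, and producing the ROOT histograms and datacard that Combine reads is not shown.

```python
def fill_histogram(scores, weights, n_bins=10, lo=0.0, hi=1.0):
    """Sum of event weights per bin for scores in [lo, hi].
    A score exactly equal to hi is placed in the last bin."""
    counts = [0.0] * n_bins
    width = (hi - lo) / n_bins
    for score, weight in zip(scores, weights):
        if score < lo or score > hi:
            continue  # out-of-range events are dropped in this sketch
        index = min(int((score - lo) / width), n_bins - 1)
        counts[index] += weight
    return counts


# Toy BDT scores and event weights (hypothetical numbers).
scores = [0.05, 0.15, 0.95, 1.0]
weights = [1.0, 2.0, 0.5, 0.5]
counts = fill_histogram(scores, weights)
print(counts)
```

The same binning would be applied to each process (Signal, TT, DY) and to data, so that all templates share identical bin edges.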


OPEN questions

1. BDTs:
Can the BDT that optimizes the selection (n b-jets, Zll, Hbb, metpt, etc.) contain variables that are used later in the BDT that derives the shape for Combine (made of, e.g., HH_mt, ZH_mt2, btag0, etc.)?
Highly correlated and low-rank variables in the shape BDT for Combine: drop or keep?

2. Variables that define the CRs (Zll, Hbb): include them in the selection BDT, or in the BDT for Combine? Or can we have just one grand BDT (no split into selection and shape variables/BDTs), optimized only in the SR?

3. In how many mass regions should we train, and at how many points should we evaluate the limit, in order to set the best limit we can? Is it OK to have different variables in different sets?

4. Which simpler option might we be missing?

5. We probably need dedicated bbZZ samples
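For the "drop or keep" question on highly correlated variables, a first step is simply to list the pairs above some threshold. The sketch below does this for three entries read off the "at mass" signal correlation matrix in these slides. The 80% threshold and the function name are assumptions, and the sketch does not decide which variable of a pair to drop.

```python
# Correlation coefficients in %, read off the "at mass" signal matrix.
signal_corr_percent = {
    ("zhmass", "min_mt2_blmet"): 87,
    ("zhmass", "mt2_ZHmet"): 38,
    ("min_mt2_blmet", "mt2_ZHmet"): 26,
}


def highly_correlated(pairs, threshold=80):
    """Return the variable pairs whose |correlation| is at or above
    the threshold (threshold in %, an arbitrary choice here)."""
    return sorted(pair for pair, value in pairs.items() if abs(value) >= threshold)


flagged = highly_correlated(signal_corr_percent)
print(flagged)
```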



BACK UP


Building HH candidate, v1

At least 2 b-jets

90 GeV < Hbb mass < 150 GeV

Exactly 2 leptons

76 GeV < Dilepton mass < 106 GeV, to be separated from bbWW

Missing ET > 20 GeV

Transverse mass of HH > 250 GeV
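The v1 requirements above can be written as a single predicate. This is a sketch only: the `Event` fields and their names are hypothetical stand-ins for whatever the analysis ntuples provide, and all masses and momenta are assumed to be in GeV.

```python
from dataclasses import dataclass


@dataclass
class Event:
    # Hypothetical per-event summary; field names are not from the analysis code.
    n_bjets: int
    n_leptons: int
    m_bb: float   # Hbb candidate mass, GeV
    m_ll: float   # dilepton mass, GeV
    met: float    # missing ET, GeV
    hh_mt: float  # transverse mass of the HH candidate, GeV


def passes_v1(ev):
    """Apply the 'Building HH candidate, v1' cuts listed on the slide."""
    return (
        ev.n_bjets >= 2
        and 90.0 < ev.m_bb < 150.0
        and ev.n_leptons == 2
        and 76.0 < ev.m_ll < 106.0
        and ev.met > 20.0
        and ev.hh_mt > 250.0
    )


# Toy events (hypothetical numbers).
candidate = Event(n_bjets=2, n_leptons=2, m_bb=120.0, m_ll=91.0, met=50.0, hh_mt=300.0)
off_peak = Event(n_bjets=2, n_leptons=2, m_bb=120.0, m_ll=60.0, met=50.0, hh_mt=300.0)
print(passes_v1(candidate), passes_v1(off_peak))
```

The second toy event fails only the dilepton mass window, which is the cut the slide uses to separate this channel from bbWW.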
