HH → bbZZ → 2b 2l 2ν
Rami Kamalieddin, Ilya Kravchenko (UNL)
Michele de Gruttola (CERN)
Lesya Shchutska (ETH)
OUTLINE
Analysis signature, strategy
Analysis
Future plans
HH → bbZZ → 2b 2l 2ν
2b 2l 2ν signature
MET from the off-shell Z* boson
H → ZZ*
H → bb
2 b-jets from H → bb
Analysis strategy
Our objective
Simplest shape
Possibilities:
Run Punzi significance or alike on 55 cuts (vary each cut at least twice down and up)
Use MVA methods (LD, BDT, DNN), then simply cut on the response of the produced MVA distribution
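A minimal sketch of the first possibility, assuming the commonly quoted Punzi figure of merit eff_S / (a/2 + sqrt(B)) and a single "value > threshold" cut. The function names, the toy numbers, and the choice a = 2 are illustrative assumptions, not taken from the analysis code.

```python
import math

def punzi(eff_sig, n_bkg, a=2.0):
    """Punzi figure of merit: signal efficiency / (a/2 + sqrt(expected background))."""
    return eff_sig / (a / 2.0 + math.sqrt(n_bkg))

def scan_cut(sig_values, bkg_values, bkg_weight, thresholds, a=2.0):
    """Scan 'value > threshold' cuts and return (best_threshold, best_fom)."""
    best_threshold, best_fom = None, -1.0
    for t in thresholds:
        eff = sum(v > t for v in sig_values) / len(sig_values)
        # bkg_weight (hypothetical) scales raw MC counts to expected yield
        n_bkg = bkg_weight * sum(v > t for v in bkg_values)
        fom = punzi(eff, n_bkg, a)
        if fom > best_fom:
            best_threshold, best_fom = t, fom
    return best_threshold, best_fom

# Toy values only, standing in for one of the cut variables
sig = [120.0, 150.0, 180.0, 200.0, 220.0]
bkg = [40.0, 60.0, 80.0, 100.0, 130.0, 160.0, 210.0]
best_threshold, best_fom = scan_cut(
    sig, bkg, bkg_weight=10.0, thresholds=[50.0, 100.0, 140.0, 170.0]
)
```

Repeating such a scan for each of the 55 cuts, one at a time, is one way to read "vary each cut down and up"; a simultaneous optimization would need a different approach.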
BDT shape
Possibilities:
N-dimensional BDT
4. N-dimensional BDT shapes mapped into a 1D histogram for the Higgs Combination Tool
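One way to picture the mapping, as a sketch for N = 2: fill a 2D histogram of two BDT outputs and flatten it row by row into a single 1D histogram. The function name, the binning, and the assumption that both scores lie in [0, 1] are illustrative, not the analysis implementation.

```python
def unroll_2d(scores_x, scores_y, n_bins_x, n_bins_y, weights=None):
    """Fill a 2D histogram on [0,1] x [0,1] and flatten it into a 1D list.

    Bin (ix, iy) of the 2D histogram lands at index ix * n_bins_y + iy.
    """
    if weights is None:
        weights = [1.0] * len(scores_x)
    hist = [0.0] * (n_bins_x * n_bins_y)
    for x, y, w in zip(scores_x, scores_y, weights):
        # Clamp so that scores at the edges fall in the first/last bin
        ix = max(0, min(int(x * n_bins_x), n_bins_x - 1))
        iy = max(0, min(int(y * n_bins_y), n_bins_y - 1))
        hist[ix * n_bins_y + iy] += w
    return hist

# Toy scores, e.g. two outputs of a multiclass BDT (assumed to lie in [0, 1])
hist_1d = unroll_2d([0.1, 0.9, 0.95, 1.0], [0.2, 0.7, 0.1, 1.0], 2, 2)
```

The flattened list has n_bins_x * n_bins_y entries and can be written out as an ordinary 1D histogram; the same indexing extends to more dimensions.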
Type of training?
Binary classification (S vs total BG)
MultiClass classification (S vs TT vs DY)
Binary classification:
S vs total BG
Binary classification
Binary classification: S vs total BG (TT+DY) with 58 vars
+ Fast
- Information about correlations among BG samples is lost
260+270 GeV 300-450 GeV 600+650 GeV 900+1000 GeV
[Figure: normalized input-variable distributions, (1/N) dN/dx, Signal vs Background, for zhmass, mt2_b1l1b2l2met, mt2_b1l2b2l1met, btag0, btag1, nbjets, min_mt2_blmet, dR_lb_min, dR_ZH, dPhi_lb_min]
[Figure: ROC curve, background rejection vs signal efficiency]
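For reference, a minimal sketch of how the points of such a curve can be computed from classifier responses. The function name and the toy scores are assumptions; TMVA produces these curves itself.

```python
def roc_points(sig_scores, bkg_scores, thresholds):
    """Return (signal efficiency, background rejection) for each threshold.

    An event is selected when its score is above the threshold.
    """
    points = []
    for t in thresholds:
        sig_eff = sum(s > t for s in sig_scores) / len(sig_scores)
        bkg_rej = sum(b <= t for b in bkg_scores) / len(bkg_scores)
        points.append((sig_eff, bkg_rej))
    return points

# Toy classifier responses
sig = [0.2, 0.6, 0.7, 0.9]
bkg = [0.1, 0.3, 0.4, 0.8]
points = roc_points(sig, bkg, thresholds=[0.0, 0.5, 1.0])
```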
[Figure: TMVA overtraining check for classifiers LD and BDT, (1/N) dN/dx vs response, Signal and Background, test vs training samples. Kolmogorov-Smirnov test, signal (background) probability: LD 0.049 (0.168), BDT 0.409 (0)]
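The overtraining check compares the classifier response on training and test samples. As a sketch of the underlying idea, the two-sample Kolmogorov-Smirnov distance (the largest gap between the two empirical CDFs) can be computed as below. This is an unbinned illustration with an assumed function name; it is not TMVA's own procedure, and it does not convert the distance into the probabilities quoted in the plots.

```python
def ks_distance(sample_a, sample_b):
    """Maximum absolute difference between the empirical CDFs of two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both CDFs past every entry equal to x
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

# Toy responses: training and test samples
d_same = ks_distance([0.1, 0.2, 0.3, 0.4], [0.1, 0.2, 0.3, 0.4])
d_shift = ks_distance([0.1, 0.3, 0.5, 0.7], [0.2, 0.4, 0.6, 0.8])
d_apart = ks_distance([0.1, 0.2], [0.3, 0.4])
```

A small distance means the training and test responses agree; a large one is a hint of overtraining.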
[Figure: normalized input-variable distributions, Signal vs Background; Correlation Matrix (background), linear correlation coefficients in %, for bpt0, zmass, hmass1, hToZZ_mt_cosine, dR_leps, dR_bjets, btag0, btag1, leppt0, zpt0, hpt0, hpt1, nbjets, dEta_lb_min, dR_ZH, mt2_llmet, mt2_ZHmet]
Rami Kamalieddin (UNL) HHbbZZ2b2l2ν UNLHEP 15 May 2017
[Figure: ROC curve, background rejection vs signal efficiency]
[Figure: TMVA overtraining check for classifiers LD and BDT, Signal and Background, test vs training samples. Kolmogorov-Smirnov test, signal (background) probability: LD 0.341 (0.073), BDT 0.248 (0.005)]
[Figure: normalized input-variable distributions, Signal vs Background; correlation matrix, linear correlation coefficients in %]
[Figure: ROC curve, background rejection vs signal efficiency]
[Figure: TMVA overtraining check for classifiers LD and BDT, Signal and Background, test vs training samples. Kolmogorov-Smirnov test, signal (background) probability: LD 0.052 (0.001), BDT 0.652 (0)]
[Figure: normalized input-variable distributions, Signal vs Background, including hhmt, dR_bjets, btag0; Correlation Matrix (background), linear correlation coefficients in %, for bpt0, hmt0, dR_leps, hhmt, dR_bjets, btag0, leppt0, zpt0, hpt0, hmt1, hpt1, zhmass, mt2_bbmet, mt2_b1l1b2l2met, mt2_b1l2b2l1met, min_mt2_blmet]
[Figure: ROC curve, background rejection vs signal efficiency]
[Figure: TMVA overtraining check for classifiers LD and BDT, Signal, test vs training samples]
Rami Kamalieddin (UNL) HHbbZZ2b2l2ν HH meeting 23 May 2017
MultiClass:
S vs TT vs DY
MultiClass
MultiClass: S vs TT vs DY for all 58 vars
- Slow: training time easily scales from hours to days with more statistics, additional types of BG, and increased numbers of BDT trees and layers
Solved by running on GPU with tf/theano backend
260+270 GeV 300-450 GeV 600+650 GeV 900+1000 GeV
[Figure: input-variable distributions for metpt and zmass; correlation matrices, linear correlation coefficients in %, for metpt, zmass, hToZZ_mt_cosine, hh_mt_cosine, mt2_llmet, mt2_bbmet, mt2_ZHmet]
MultiClass, at mass
[Figure: input-variable distributions (Signal, DY) for metpt and zmass; Correlation Matrix (Signal) and further correlation matrices, linear correlation coefficients in %, for metpt, zmass, hToZZ_mt_cosine, hh_mt_cosine, zhmass, min_mt2_blmet, mt2_ZHmet]
[Figure: TMVA response for classifier BDT, (1/N) dN/dx, three panels: BDT response for Signal, for DY, for TT; each shows TT, DY, and Signal for test and training samples]
[Figure: input-variable distributions for metpt, btag0, zmass; correlation matrix rows include mt2_b1l2b2l1met, zhmass, dR_ZH]
[Figure: TMVA response for classifier BDT, (1/N) dN/dx, three panels: BDT response for Signal, for DY, for TT; each shows TT, DY, and Signal for test and training samples]
[Figure: input-variable distributions (Signal, TT, DY) for bpt0, leppt0, zpt0; correlation matrix, linear correlation coefficients in %, rows include min_mt2_blmet, mt2_b1l2b2l1met, mt2_b1l1b2l2met, zhmass, hhmt, dR_leps, dR_bjets]
[Figure: TMVA response for classifier BDT, (1/N) dN/dx, three panels: BDT response for Signal, for DY, for TT; each shows Signal, TT, and DY for test and training samples]
Conclusions
Analysis strategy is clear: binned shape analysis using Higgs Combine
1. BDTs:
OPEN questions
Can the BDT used to optimize the selection (n b-jets, Zll, Hbb, metpt, etc.) contain variables that are used later in the BDT that derives the shape (made of e.g. HH_mt, ZH_mt2, btag0, etc.) for Combine?
Highly correlated and low-rank variables in the BDT shape for Combine: drop or keep?
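To make the second question concrete, here is a sketch of how highly correlated variable pairs could be flagged from linear correlation coefficients in %, as shown in the correlation matrices. The 80% threshold, the function names, and the toy values are assumptions; the sketch only lists candidate pairs and does not decide whether to drop them.

```python
import math

def pearson_percent(x, y):
    """Linear (Pearson) correlation coefficient of two columns, in percent."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return 100.0 * cov / math.sqrt(var_x * var_y)

def correlated_pairs(columns, threshold=80.0):
    """List (name_a, name_b, rho in %) for pairs with |rho| >= threshold."""
    names = sorted(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            rho = pearson_percent(columns[a], columns[b])
            if abs(rho) >= threshold:
                pairs.append((a, b, round(rho)))
    return pairs

# Toy values only; the variable names are borrowed from the input list
columns = {
    "hpt0": [1.0, 2.0, 3.0, 4.0],
    "bpt0": [2.0, 4.0, 6.0, 8.0],
    "dR_bjets": [4.0, 3.0, 2.0, 1.0],
    "btag0": [0.9, 0.1, 0.8, 0.2],
}
pairs = correlated_pairs(columns)
```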
BACK UP
Building HH candidate, v1
At least 2 b-jets
Exactly 2 leptons