ATLAS NOTE

29th July 2016

Review of the ATLAS Open Data Dataset

The ATLAS Collaboration

ATL-OREACH-PUB-2016-001
03 August 2016

Abstract

The ATLAS collaboration is releasing to the public, for educational purposes, 1 fb⁻¹ of real
data at a centre-of-mass energy of 8 TeV from the 2012 data-taking period. This set of real
data is accompanied by matching simulated data of several Standard Model processes and a
Beyond the Standard Model signal. Analysis tools are provided to make analysis of the data
easily accessible. The purpose of the data and tools released is to enable users to experience
the analysis of particle physics data in a simplified environment, for example, in lab courses
or as an extension of physics masterclasses.
This document summarises the properties of the ATLAS open data dataset and the analysis
tools. In addition, example analyses intended as starting points for further analysis work by
users are shown and their results reviewed.

© 2016 CERN for the benefit of the ATLAS Collaboration.
Reproduction of this article or parts of it is allowed as specified in the CC-BY-4.0 license.

Contents

1 Introduction
1.1 Datasets
1.2 Analysis Tools

2 Example Analyses
2.1 W Analysis
2.2 Z Analysis
2.3 Top Quark Pair Analysis
2.4 WZ Analysis
2.5 ZZ Analysis
2.6 H → WW Analysis
2.7 Z′ Analysis

3 Summary

Appendix

1 Introduction

The ATLAS collaboration is releasing an official dataset, open to the public for educational use only,
following the guidelines of the ATLAS Data Access Policy [1]. The dataset consists of real data with an
integrated luminosity of (1.0 ± 0.019) fb⁻¹ and a centre-of-mass energy of 8 TeV, with matching simulated
data. The dataset is intended to provide the means for doing hands-on particle physics exercises in the
context of higher education, for example laboratory courses or introductory exercises for undergraduate
students. The released data may also prove beneficial for the production of teaching materials, lectures,
and public talks. Furthermore, it may be used by people with data analysis experience, but not necessarily
a physics background, as a test dataset for studying and developing analysis techniques. The Kaggle Higgs
Boson Machine Learning Challenge [2] has demonstrated the viability of this application scenario.
The released data is provided in a simplified format to reduce the complexities of a full-scale analysis,
decrease the processing time, and facilitate code development. Analysis code is written in Python and
several example analyses are available as a starting point for further work. The technical details about
the dataset are discussed in the following sections. Section 1.1 details the preselection of the datasets
and explains the simplifications that have been made in comparison with a physics analysis. Section 1.2
introduces the analysis tools, and Section 2 presents the example analyses.
The example analyses include a single W boson, Z boson, and top quark pair production analysis, all of
which have sufficiently high event yields to study the processes in detail. Analyses for processes with
lower production cross sections, namely WZ, ZZ, and H → WW, are used to illustrate the statistical
limitations of the dataset. Finally, a Z′ analysis is included to allow searches for new physics, again with
an emphasis on the educational character of the exercise.

1.1 Datasets

The ATLAS open data dataset comprises real data recorded with the ATLAS detector in 2012
and matching simulated data. Both real and simulated data are subjected to a loose event preselection,
which reduces processing time by reducing the overall number of events that have to be analysed. The
preselection consists of a set of object selection criteria listed in Table 1. A further event selection is
applied to these selected objects, defined by the following criteria:

• Corrupted event protection;
• Single lepton trigger satisfied;
• Veto on events containing bad jets. Bad jets are jets not associated to energy deposits in the
  calorimeters from particles originating from the primary pp collision. They arise from various
  sources, ranging from LHC beam conditions to cosmic-ray showers;
• Primary vertex cut (Ntracks > 4);
• At least one preselected lepton with pT > 25 GeV.
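
The event-level criteria listed above can be sketched as a single predicate. This is a minimal illustration only: the `Event` record and all of its field names are hypothetical and do not correspond to the branch names of the released files, and the individual flags (corruption, trigger, bad jets) are assumed to have been computed beforehand.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    """Hypothetical event record; the field names are illustrative only."""
    is_corrupted: bool
    passes_single_lepton_trigger: bool
    has_bad_jet: bool
    n_primary_vertex_tracks: int
    preselected_lepton_pts_gev: List[float]


def passes_event_preselection(event: Event) -> bool:
    """Apply the five event-level criteria in the order given in the text."""
    if event.is_corrupted:                      # corrupted event protection
        return False
    if not event.passes_single_lepton_trigger:  # single lepton trigger
        return False
    if event.has_bad_jet:                       # veto on events with bad jets
        return False
    if event.n_primary_vertex_tracks <= 4:      # primary vertex cut: Ntracks > 4
        return False
    # at least one preselected lepton with pT > 25 GeV
    return any(pt > 25.0 for pt in event.preselected_lepton_pts_gev)
```

Ordering the cheap boolean checks first means the lepton loop only runs for events that survive the quality requirements.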

The selected events are available in a simplified data format reducing the information content of the
original data analysis format used in ATLAS. The resulting format is a TTree with 45 branches as detailed
in Table 3. The layout is optimised towards simplicity to reduce the complexities encountered in a
full-scale analysis, emphasising the educational character of the dataset. The framework used by the ATLAS
top analysis group, called "AnalysisTop", was used to derive the simplified structure.
The set of real data has an integrated luminosity of (1.0007 ± 0.019) fb⁻¹ and a centre-of-mass energy of
8 TeV. Events from the Egamma and Muon streams from runs 207490, 207532, 207582, 207589, 207749,
207772, 207845, 207865, 207934, 207982, 208126, 208184, 208189, and 208258 are selected. All runs
belong to period D of the 2012 data taking and form the input for the preselection resulting in the dataset
summarised in Table 4.

electrons                     muons                   jets
----------------------------  ----------------------  ---------------------------------
reconstruction author 1||3    Muid combined           antiKt4LCTopo
medium++ quality              tight quality           jet cleaning (veto BadLooseMinus)
pT > 5 GeV                    pT > 5 GeV              pT > 25 GeV
|η| < 2.47 w/o crack          |η| < 2.5               |η| < 2.5
Object Quality is Good        MCP Hit requirement
|z0| < 2.0 mm                 |z0| < 2.0 mm
not Converted

Table 1: Preselection requirements for electrons, muons and jets, as applied in the ATLAS top group analysis
framework. The electron reconstruction algorithm is either calorimeter-based (author value 1) or both
calorimeter and track based (author value 3). The term "w/o crack" refers to the so-called crack region
located at 1.37 < |η| < 1.52, where detector performance is degraded. Electron candidates found in this
region are discarded. In addition, the term "not Converted" denotes that electrons which have been identified
as originating from photon conversion are not considered further. The Muid muon reconstruction algorithm
package is used and muons are required to satisfy the set of requirements on detector hits issued by the Muon
Combined Performance (MCP) group. Jet reconstruction is carried out by the anti-kt clustering algorithm, with
clusters identified by energy in calorimeter cells. Jet cleaning is applied. Jets are labelled as bad if they
are not associated to energy deposits in the calorimeters from particles originating from pp collisions.

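
The kinematic part of the object preselection in Table 1 can be illustrated with a short sketch. Only the pT and |η| requirements are modelled, including the electron crack veto; the quality, hit, impact parameter and conversion requirements are omitted, and the function names are hypothetical.

```python
def electron_in_acceptance(pt_gev: float, eta: float) -> bool:
    """pT > 5 GeV and |eta| < 2.47, excluding the crack region 1.37 < |eta| < 1.52."""
    abs_eta = abs(eta)
    in_crack = 1.37 < abs_eta < 1.52
    return pt_gev > 5.0 and abs_eta < 2.47 and not in_crack


def muon_in_acceptance(pt_gev: float, eta: float) -> bool:
    """pT > 5 GeV and |eta| < 2.5."""
    return pt_gev > 5.0 and abs(eta) < 2.5


def jet_in_acceptance(pt_gev: float, eta: float) -> bool:
    """pT > 25 GeV and |eta| < 2.5."""
    return pt_gev > 25.0 and abs(eta) < 2.5
```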
The simulated datasets used in the data release are shown in Table 5 and Table 6. The same preselection
as for real data is applied. A reduction procedure has also been applied to samples with very high
initial statistics. The aim of the procedure is to lower the processing time by reducing the number of
preselected events in the sample while retaining enough statistics for meaningful comparisons between
real and simulated data. The number of reduced events and the resulting luminosities are listed in Table 5
and Table 6. In cases where very large datasets were available, e.g. Z → ee, only a subset of the full
dataset was processed.
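
The relation between the number of simulated events, the cross section and the equivalent luminosity of a simulated sample can be sketched as follows. This is a simplified illustration assuming that the luminosity of a sample is the number of events divided by the production cross section, and that simulated events are scaled to the real data by the ratio of luminosities; corrections such as filter efficiencies, k-factors and per-event generator weights are ignored, and the numbers in the example are hypothetical rather than taken from Table 5 or Table 6.

```python
def simulated_luminosity_pb(n_events: float, cross_section_pb: float) -> float:
    """Equivalent integrated luminosity (in pb^-1) of a simulated sample."""
    if cross_section_pb <= 0.0:
        raise ValueError("cross section must be positive")
    return n_events / cross_section_pb


def luminosity_weight(data_lumi_pb: float, n_events: float,
                      cross_section_pb: float) -> float:
    """Per-event weight that scales a simulated sample to the data luminosity."""
    return data_lumi_pb / simulated_luminosity_pb(n_events, cross_section_pb)


# Hypothetical sample: 200000 events for a process with a 100 pb cross section,
# scaled to 1.0007 fb^-1 = 1000.7 pb^-1 of real data.
example_weight = luminosity_weight(1000.7, 200_000, 100.0)
```

A sample whose equivalent luminosity exceeds that of the real data receives a weight below one, which is the regime in which the statistical uncertainty of the simulation is smaller than that of the data.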
An important aspect of the samples is that they were prepared specifically for educational purposes. To
this end, precision has been traded for simplicity of use. The simplifications are:

• No facilities to estimate systematic uncertainties have been included as these quickly introduce large
  complexities. This is of special importance as some variables may show discrepancies when only
  considering the statistical uncertainties, especially in high statistics analyses.
• Scale factors implementing corrections for electrons and muons are calculated using the preselection
  strategy of the AnalysisTop framework. This object selection does not have to coincide with the
  actual object selection defined by the user. Therefore, discrepancies may arise due to non-matching
  object definitions.
• The b-tagging scale factor is computed for a specific working point for a specific b-tagging algorithm
  (MV1@70% efficiency). The user, however, is free to specify the b-tagging weight used for tagging
  jets. This introduces a potential mismatch between real and simulated data because the working
  point and algorithm considered in the scale factor calculation differ from the ones being actually
  applied.
• No QCD samples were prepared as they would have been insufficient in statistics while introducing
  a large set of additional samples. The contributing effects of QCD may be countered using strict
  object definitions. However, analyses such as the W boson analysis may still suffer from the omission
  of these samples.
• The description of the W boson properties in simulated W+jets events is not ideal. The AnalysisTop
  framework provides scale factors to correct for these issues. These corrections are only available for
  samples produced with Alpgen but not for those produced with Sherpa. However, using Alpgen
  would have introduced a prohibitively large number of samples. Therefore, Sherpa was used
  although no corrections for the W boson modelling are provided for it.
• The simulated data take into account the pile-up and vertex position profile of the whole 2012 data
  taking although the real data is taken from a small list of runs from period D. This introduces a
  certain mismatch regarding the number of vertices and the primary vertex position.

1.2 Analysis Tools

The ATLAS open data dataset is accompanied by a set of analysis tools written in Python interfaced with
ROOT [3]. These tools implement the protocols needed for reading the files, writing out histograms and
plotting results. Ease of use and a clear structure of the tools are emphasised. Several example analyses are
provided and are intended to be starting points for further development. Full documentation on the tools
is provided as a gitbook in an online resource.

electrons & muons      jets
---------------------  -----------------------
pT > 25 GeV            pT > 25 GeV
ptconerel30 < 0.15     Jet Vertex Fraction cut
etconerel20 < 0.15

Table 2: Standard selection on objects applied in the ATLAS open data tools. This standard selection is intended
as a starting point for customised object selections implemented by the user. Unless stated otherwise this standard
selection is applied in each of the example analyses. Jet Vertex Fraction (JVF) is a measure used to suppress jets
from pp collisions additional to the primary pp collision. It uses information on the track to vertex association to
evaluate which fraction of tracks associated with the jet stems from the primary vertex. Requirements are placed on
the relative transverse energy isolation (etconerel20) and the relative transverse momentum isolation (ptconerel30).
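
The standard object selection of Table 2 can be written as two small predicates. This is a sketch under stated assumptions, not the implementation shipped with the tools: the function and argument names are hypothetical, momenta are assumed to be given in GeV, and because the text does not quote the value or exact form of the Jet Vertex Fraction cut, the threshold `jvf_min` is left as a required argument.

```python
def is_standard_lepton(pt_gev: float, ptcone30_rel: float,
                       etcone20_rel: float) -> bool:
    """Electron or muon passing pT > 25 GeV and both relative isolation cuts."""
    return pt_gev > 25.0 and ptcone30_rel < 0.15 and etcone20_rel < 0.15


def is_standard_jet(pt_gev: float, jvf: float, jvf_min: float) -> bool:
    """Jet passing pT > 25 GeV and a JVF requirement.

    jvf_min is a hypothetical parameter: the threshold is not given in the text.
    """
    return pt_gev > 25.0 and jvf > jvf_min
```

Keeping the thresholds in one place makes it straightforward to tighten the isolation requirements, which the text suggests as one way of countering the missing QCD samples.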

2 Example Analyses

Example analyses and their results are shown in the following sections alongside the selection criteria
specific to the analysis at hand. In most analyses a standard object selection, detailed in Table 2, is applied
on top of the preselection detailed in Table 1, as described in Section 1.1. This standard object selection
is intended as a starting point for a more optimised object selection and serves primarily as a common
ground for the subsequent event selections of the individual analyses. In real data, the event is required to
satisfy quality constraints defined in the Good Run List (GRL) to ensure only high quality data is used for
physics measurements.
The purpose of these example analyses is to showcase the abilities and limitations of the real and simulated
datasets included in the ATLAS open data release. These analyses are grouped as follows:

• Three high statistics Standard Model analyses have been implemented: a selection of events with
  one W boson decaying to leptons, a selection of a Z boson decaying to a lepton pair, and a selection
  of top quark pairs resulting in the final state ℓνjjjj. These analyses are intended to show that the
  general description of the data for these important Standard Model processes is sound. They also
  enable the study of Standard Model observables, such as the mass of the Z boson. Observable
  discrepancies between data and simulation are due to the simplified nature of the ntuples.
• Three low statistics Standard Model analyses are presented showing the limitations of the ATLAS
  open data dataset with respect to rarer processes. They are a WZ analysis, a ZZ analysis, and a
  H → WW analysis. Although it is still possible to obtain results in these analyses and achieve
  educational objectives, the statistical limitations prohibit more meaningful analyses. This point is
  particularly important as it demonstrates that the proposed datasets are intended for educational
  purposes only.
• A Z′ → tt̄ analysis serves as an example for a beyond the Standard Model (BSM) analysis. Multiple
  samples of simulated data containing Z′ signal events are provided to implement a simplified analysis
  for searching for new physics.

The analysis plots in the following sections contain the ratio of real data to simulated Monte-Carlo data,
to give an understanding of the quality of simulated data modelling. These are labelled "Data/MC". In case
two leptons are present in the final state they are ordered by transverse momentum with the leading one
labelled "leading" and the subleading one labelled "trailing".
The list of example analyses is not exhaustive. Further processes that may be explored include WW
production, dileptonic top quark pair production, single top production, and many others.
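
The Data/MC ratio shown in the lower panel of each plot can be computed bin by bin as sketched below. The sketch assumes that the uncertainty on the ratio is the Poisson uncertainty of the data divided by the simulated yield, which neglects the statistical uncertainty of the simulation; the function name is hypothetical and the plotting code of the released tools may differ.

```python
import math
from typing import List, Optional, Sequence, Tuple


def data_mc_ratio(
    data_counts: Sequence[float], mc_counts: Sequence[float]
) -> List[Tuple[Optional[float], Optional[float]]]:
    """Return (ratio, uncertainty) per bin; (None, None) where the MC yield is zero."""
    if len(data_counts) != len(mc_counts):
        raise ValueError("histograms must have the same number of bins")
    result: List[Tuple[Optional[float], Optional[float]]] = []
    for n_data, n_mc in zip(data_counts, mc_counts):
        if n_mc > 0.0:
            # Poisson uncertainty sqrt(N) on the data, propagated to the ratio
            result.append((n_data / n_mc, math.sqrt(n_data) / n_mc))
        else:
            result.append((None, None))
    return result
```

A flat ratio close to one indicates good modelling, while a slope or offset points to one of the simplifications listed in Section 1.1.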

2.1 W Analysis

This analysis is intended to provide an example for a high statistics analysis using the ATLAS open data
dataset. Furthermore, it tests the description of the real data by the simulated W boson data, which is the
most limited process in terms of available Monte-Carlo statistics. An interesting variable to study would
be the ratio W⁺/W⁻ and its dependence on the pseudorapidity of the selected lepton. This would be a
direct extension of the physics examined in the W-path of the ATLAS Masterclasses [4].
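
As a starting point for the W⁺/W⁻ study suggested above, the ratio can be estimated by counting selected leptons of each charge in bins of |η|. This is a rough sketch: the function name and input layout are hypothetical, and no background subtraction or efficiency correction is applied, so the result is a ratio of candidate counts rather than a measured cross-section ratio.

```python
from typing import List, Optional, Sequence, Tuple


def charge_ratio_by_abs_eta(
    leptons: Sequence[Tuple[int, float]], bin_edges: Sequence[float]
) -> List[Optional[float]]:
    """Ratio N(positive)/N(negative) per |eta| bin.

    leptons   -- (charge, eta) pairs of the selected lepton in each W candidate
    bin_edges -- increasing |eta| bin edges; bins are [low, high)
    """
    n_bins = len(bin_edges) - 1
    n_plus = [0] * n_bins
    n_minus = [0] * n_bins
    for charge, eta in leptons:
        abs_eta = abs(eta)
        for i in range(n_bins):
            if bin_edges[i] <= abs_eta < bin_edges[i + 1]:
                if charge > 0:
                    n_plus[i] += 1
                elif charge < 0:
                    n_minus[i] += 1
                break
    # None marks bins without negative leptons, where the ratio is undefined
    return [p / m if m > 0 else None for p, m in zip(n_plus, n_minus)]
```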
This analysis implements the criteria for single W boson events with the W boson decaying to leptons.
It is based loosely on the charge asymmetry measurement carried out at √s = 7 TeV [5]. The standard
object selection criteria (see Table 2) are applied. The event selection criteria are:
• Single electron or muon trigger is satisfied;
• Event in real data passes the Good Run List;
• Event has a good vertex (Ntracks > 4);
• Exactly one good lepton (see footnote 1) with pT > 25 GeV;
• ETmiss > 30 GeV;
• mT^W > 30 GeV.
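
The W event selection can be sketched in a few lines, using the transverse mass as defined in the footnote of this section. The function names and the input layout are hypothetical, all momenta are assumed to be in GeV and angles in radians, and the leptons passed in are assumed to have already satisfied the standard object selection.

```python
import math
from typing import Sequence, Tuple


def transverse_mass(lep_pt: float, lep_phi: float,
                    met: float, met_phi: float) -> float:
    """mT = sqrt(2 * pT(lepton) * ETmiss * (1 - cos(delta phi)))."""
    delta_phi = lep_phi - met_phi
    return math.sqrt(max(0.0, 2.0 * lep_pt * met * (1.0 - math.cos(delta_phi))))


def passes_w_selection(trigger_ok: bool, in_good_run_list: bool,
                       n_vertex_tracks: int,
                       good_leptons: Sequence[Tuple[float, float]],
                       met: float, met_phi: float,
                       is_real_data: bool = True) -> bool:
    """good_leptons holds (pT, phi) pairs of leptons passing the object selection."""
    if not trigger_ok:
        return False
    if is_real_data and not in_good_run_list:  # Good Run List applies to real data
        return False
    if n_vertex_tracks <= 4:                   # good vertex: Ntracks > 4
        return False
    if len(good_leptons) != 1:                 # exactly one good lepton
        return False
    lep_pt, lep_phi = good_leptons[0]
    if lep_pt <= 25.0 or met <= 30.0:
        return False
    return transverse_mass(lep_pt, lep_phi, met, met_phi) > 30.0
```

For a lepton and missing transverse momentum that are back to back with 40 GeV each, the transverse mass evaluates to 80 GeV, close to the W boson mass, which is where the distribution is expected to fall off.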
The W analysis is potentially prone to QCD contributions, as there is only one lepton present, which may
come from non-prompt sources mimicking the desired final state. Therefore, potential disagreements
must always be interpreted bearing in mind that QCD contributions are not taken into account. QCD samples
are not provided as they have very low statistics after selection while having a large file size.
The distributions of the transverse mass (see footnote 2) and the missing transverse momentum shown in
Figure 1 are affected by the omission of QCD contributions, which predominantly populate the low missing
transverse momentum and low transverse mass regions. A comparison of results obtained here to those of
W+jets analyses considering the impact of QCD processes supports this explanation [6].
The histograms depicting the vertex information in Figure 2 show the expected disagreement between
simulated and real data. The pile-up treatment in simulated data considers the whole 2012 run period
whilst the real data is taken only from period D of the 2012 data taking.
The overall description of the lepton kinematics by the simulated data is good as can be seen in Figure 3.
The figure also depicts the type of lepton expressed using the absolute value of the PDG id [7]. Elec-
trons/positrons have a PDG id of 11/-11 whereas muons/antimuons are denoted with a PDG id of 13/-13.
Less well described are the tracking and isolation variables shown in Figure 4. Here, the rise of the ratio
between data and simulation at higher isolation values suggests that QCD contributions are missing. In
this region QCD processes would contribute by either the misidentification of a jet as a lepton or by a
hadron decay to leptons inside a jet. These so-called non-prompt leptons are not well isolated resulting in
higher values for the isolation variables shown.
Figure 5 depicts the kinematics, jet vertex fraction, and the MV1 b-tagging weight of the selected jets.
Here, a slightly larger normalisation offset between real and simulated data is observed. This again may
be attributed to missing QCD contributions as the one jet bin would most likely be populated by dijet
events, where one jet is either misidentified as the lepton or supplies a non-prompt lepton and the other
counts towards the jet multiplicity. Apart from the normalisation issue the variables are reasonably well
described by the simulated data.

Footnote 1: When describing selections, "lepton" refers to an electron or muon candidate.
Footnote 2: The transverse mass is defined as:
$m_\mathrm{T} = \sqrt{2\, p_\mathrm{T}^{\ell}\, E_\mathrm{T}^{\mathrm{miss}} \left[ 1 - \cos \Delta\phi\left(\ell, E_\mathrm{T}^{\mathrm{miss}}\right) \right]}$.

[Figure 1 plots: events versus ETmiss [GeV] and events versus mT^W [GeV], each with a Data/MC ratio panel.
Samples shown: Data, Diboson, DrellYan, W, Z, stop, ttbar.]

Figure 1: W Analysis: Event variable histograms. The variables plotted are the missing transverse momentum ETmiss
and the transverse mass of the W boson candidate mT^W.

[Figure 2 plots: events versus Nvertex and events versus zvertex [mm], each with a Data/MC ratio panel.]

Figure 2: W Analysis: Vertex histograms. The number of vertices Nvertex and the z coordinate of the primary vertex
zvertex are shown.

[Figure 3 plots: lepton pT [GeV], η, φ, E [GeV], |PDG id| and charge Q, each with a Data/MC ratio panel.]

Figure 3: W Analysis: Leading lepton properties. From upper left to lower right are shown: transverse momentum
pT, pseudorapidity η, azimuthal angle φ, energy E, absolute value of the PDG id |PDG id|, and charge Q.

[Figure 4 plots: lepton etconerel20, ptconerel30, z0 [mm] and d0 [mm], each with a Data/MC ratio panel.]

Figure 4: W Analysis: Leading lepton isolation and tracking information. From upper left to lower right are
shown: relative transverse energy isolation (etconerel20), relative transverse momentum isolation (ptconerel30),
longitudinal impact parameter z0, and transverse impact parameter d0.

[Figure 5 plots: Njets, jet pT [GeV], jet η, jet mass [GeV], JVF and MV1 weight, each with a Data/MC ratio panel.]

Figure 5: W Analysis: Jet properties. From upper left to lower right are shown: jet multiplicity Njets, transverse
momentum pT, pseudorapidity η, mass m, jet vertex fraction (JVF) and MV1 b-tagging weight of the selected jets.

2.2 Z Analysis

Many analyses selecting leptons suffer from Z+jets as a contributing background due to its large production
cross section. It is therefore vital to check the correct modelling of this process by the simulated data.
In addition, the exercises of the ATLAS Z-path Masterclasses [8] may be extended by this example
analysis.
The Z boson analysis implemented here considers Z boson decays into an electron-positron or
muon-antimuon pair. The standard object selection criteria (see Table 2) are applied. The event selection
criteria are:
• Single electron or muon trigger is satisfied;
• Event in real data passes the Good Run List;
• Event has a good vertex (Ntracks > 4);
• Exactly two good leptons with pT > 25 GeV;
• Leptons have opposite charge;
• Leptons have same flavour;
• |mℓℓ − mZ| < 20 GeV with mZ = 91.18 GeV.
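
The lepton-pair part of this selection can be sketched as follows, with the invariant mass built from the transverse momentum, pseudorapidity, azimuthal angle and energy of each lepton. The `Lepton` record and function names are hypothetical, momenta and energies are assumed to be in GeV, and the trigger, Good Run List and vertex requirements are assumed to have been checked separately.

```python
import math
from dataclasses import dataclass
from typing import List

Z_MASS_GEV = 91.18  # value of mZ quoted in the selection


@dataclass
class Lepton:
    """Hypothetical lepton record; field names are illustrative only."""
    pt: float
    eta: float
    phi: float
    energy: float
    charge: int
    pdg_id: int


def invariant_mass(a: Lepton, b: Lepton) -> float:
    """Invariant mass of the two-lepton system from summed four-momenta."""
    px = a.pt * math.cos(a.phi) + b.pt * math.cos(b.phi)
    py = a.pt * math.sin(a.phi) + b.pt * math.sin(b.phi)
    pz = a.pt * math.sinh(a.eta) + b.pt * math.sinh(b.eta)
    energy = a.energy + b.energy
    mass_squared = energy ** 2 - (px ** 2 + py ** 2 + pz ** 2)
    return math.sqrt(max(mass_squared, 0.0))  # guard against rounding below zero


def passes_z_selection(good_leptons: List[Lepton],
                       window_gev: float = 20.0) -> bool:
    """Exactly two good leptons, opposite charge, same flavour, mass window."""
    if len(good_leptons) != 2:
        return False
    first, second = good_leptons
    if first.pt <= 25.0 or second.pt <= 25.0:
        return False
    if first.charge * second.charge >= 0:         # opposite charge
        return False
    if abs(first.pdg_id) != abs(second.pdg_id):   # same flavour
        return False
    return abs(invariant_mass(first, second) - Z_MASS_GEV) < window_gev
```

Comparing the absolute values of the PDG ids makes the flavour check independent of whether the files store signed or unsigned lepton types.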
The modelling of the lepton kinematics by the simulated data is very good, as can be seen in Figure 6
and Figure 8. The isolation and tracking information shown in Figure 7 and Figure 9 is also described
well. Figure 10 summarises the jet information. The jet multiplicity shows a slight disagreement in the
higher jet bins, which would be covered by systematic uncertainties. This leads to a slight disagreement
in normalisation for the jet histograms. Nonetheless, the description of the real data by the simulated data
is very good when comparing shapes.
The histograms depicting the vertex information in Figure 11 show the expected disagreement between
simulated and real data. The pile-up treatment in simulated data considers the whole 2012 run period
whilst the real data is taken only from period D of the 2012 data taking. Figure 12 depicts the invariant
mass of the reconstructed Z boson candidate, which shows excellent agreement between real and simulated
data. The poor modelling of ETmiss, also shown in Figure 12, is due to the complexities of simulating
missing transverse momentum in the absence of an actual neutrino from the hard scattering.

[Figure 6 plots: leading lepton pT [GeV], η, φ, E [GeV], |PDG id| and charge Q, each with a Data/MC ratio panel.]

Figure 6: Z Analysis: Leading lepton properties. From upper left to lower right are shown: transverse momentum
pT, pseudorapidity η, azimuthal angle φ, energy E, absolute value of the PDG id |PDG id|, and charge Q.

[Figure 7 plots: leading lepton etconerel20, ptconerel30, z0 [mm] and d0 [mm], each with a Data/MC ratio panel.]

Figure 7: Z Analysis: Leading lepton isolation and tracking information. From upper left to lower right are
shown: relative transverse energy isolation (etconerel20), relative transverse momentum isolation (ptconerel30),
longitudinal impact parameter z0, and transverse impact parameter d0.

[Figure 8 plots: trailing lepton pT [GeV], η, φ, E [GeV], |PDG id| and charge Q, each with a Data/MC ratio panel.]

Figure 8: Z Analysis: Trailing lepton properties. From upper left to lower right are shown: transverse momentum
pT, pseudorapidity η, azimuthal angle φ, energy E, absolute value of the PDG id |PDG id|, and charge Q.

[Figure 9 plots: trailing lepton etconerel20, ptconerel30, z0 [mm] and d0 [mm], each with a Data/MC ratio panel.]

Figure 9: Z Analysis: Trailing lepton isolation and tracking information. From upper left to lower right are
shown: relative transverse energy isolation (etconerel20), relative transverse momentum isolation (ptconerel30),
longitudinal impact parameter z0, and transverse impact parameter d0.

[Figure 10 plots: Njets, jet pT [GeV], jet η, jet mass [GeV], JVF and MV1 weight, each with a Data/MC ratio panel.]

Figure 10: Z Analysis: Jet properties. From upper left to lower right are shown: jet multiplicity Njets, transverse
momentum pT, pseudorapidity η, mass m, jet vertex fraction (JVF) and MV1 b-tagging weight of the selected jets.

[Figure 11 plots: events versus Nvertex and events versus zvertex [mm], each with a Data/MC ratio panel.]

Figure 11: Z Analysis: Vertex histograms. The number of vertices Nvertex and the z coordinate of the primary vertex
zvertex are shown.

[Figure 12 plots: events versus ETmiss [GeV] and events versus mℓℓ [GeV], each with a Data/MC ratio panel.]

Figure 12: Z Analysis: Event variable histograms. The variables plotted are the missing transverse momentum
ETmiss and the invariant mass of the Z boson candidate mℓℓ.

2.3 Top Quark Pair Analysis

The LHC is a top quark factory, and understanding top quark physics is one of the major goals
of the ATLAS physics programme. This understanding is crucial for studying rarer processes as top quark
pair production is a background to virtually all processes having leptons and multiple jets in their final
states. Top quark pair production can be studied in the ATLAS open data dataset in both the semileptonic
and dileptonic final states. Statistics are expected to be sufficient for producing detailed distributions and
exploring advanced techniques like the reconstruction of the top quark pair system.
This analysis mimics a standard top quark pair selection in the semileptonic channel. The standard object
selection criteria (see Table 2) are applied. The event selection is defined as:
Single electron or muon trigger is satisfied;
Event in real data passes the Good Run List;
Event has a good vertex ($N_{tracks} > 4$);
Exactly one good lepton with $p_T > 25$ GeV;
At least four good jets;
At least two b-tagged jets (MV1@70%);
$E_T^{miss} > 30$ GeV;
$m_T^W > 30$ GeV.
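The selection above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code shipped with the analysis tools: the event is represented by a hypothetical dictionary, all key names are invented for this example, momenta are assumed to be in GeV, the MV1 threshold corresponding to the 70% working point is passed in as a parameter rather than hard-coded, and $m_T^W$ is assumed to follow the usual definition $\sqrt{2 p_T^{\ell} E_T^{miss} (1 - \cos\Delta\phi)}$.

```python
import math


def transverse_mass(lep_pt, lep_phi, met_et, met_phi):
    """Transverse mass of the lepton + missing-momentum system (input units are kept)."""
    dphi = lep_phi - met_phi
    return math.sqrt(max(0.0, 2.0 * lep_pt * met_et * (1.0 - math.cos(dphi))))


def passes_ttbar_selection(event, mv1_cut):
    """Apply the semileptonic top quark pair selection to one event.

    `event` is a hypothetical dict. Its leptons and jets are assumed to have
    already passed the standard object selection (i.e. they are "good" objects).
    """
    if not (event["trig_e"] or event["trig_m"]):
        return False
    # The Good Run List requirement applies to real data only.
    if event["is_data"] and not event["pass_grl"]:
        return False
    if not event["has_good_vertex"]:
        return False

    # One possible reading of "exactly one good lepton with pT > 25 GeV".
    leptons = [lep for lep in event["leptons"] if lep["pt"] > 25.0]
    if len(leptons) != 1:
        return False

    jets = event["jets"]
    if len(jets) < 4:
        return False
    n_btag = sum(1 for jet in jets if jet["mv1"] > mv1_cut)
    if n_btag < 2:
        return False

    if event["met_et"] <= 30.0:
        return False
    lep = leptons[0]
    mtw = transverse_mass(lep["pt"], lep["phi"], event["met_et"], event["met_phi"])
    return mtw > 30.0
```

Keeping each criterion as a separate early return mirrors the order of the list and makes it straightforward to count how many events survive each step.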
There is a small disagreement in normalisation of approximately 5 % between simulated and real data.
This may be attributed to the fact that the b-tagging scale factor is not applied despite two b-tags being
required. A survey of the shapes of all presented histograms reveals no obvious discrepancies.
Figure 13 shows the kinematics, type, and charge of the leading lepton, all of which are reproduced well by the
simulated data. Tracking and isolation related variables (see Figure 14) are not as well described but
show reasonable agreement in the high-statistics regions. Figure 15 summarises the jet properties. Jet
multiplicity, jet kinematics, jet vertex fraction, and MV1 b-tagging weight are well described by the
simulated data.
The histograms depicting the vertex information in Figure 16 show the expected disagreement between
simulated and real data. The pile-up treatment in simulated data considers the whole 2012 run period
whilst the real data is taken only from period D of the 2012 data taking. The transverse mass of the W
boson candidate and the missing transverse momentum are shown in Figure 17. Both exhibit a flat ratio
between simulated and real data, indicating that no apparent mismodelling is present.
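The data-to-simulation ratio referred to here is evaluated bin by bin. A minimal sketch of such a calculation from binned yields is given below; the function name is hypothetical, and the uncertainty accounts only for the Poisson uncertainty of the real data, which is an assumption of this sketch and not a statement about how the figures in this note were produced.

```python
import math


def data_mc_ratio(data_counts, mc_counts):
    """Return a list of (ratio, uncertainty) per bin.

    Bins without a simulated prediction yield (None, None), since the
    ratio is undefined there.
    """
    ratios = []
    for n_data, n_mc in zip(data_counts, mc_counts):
        if n_mc <= 0:
            ratios.append((None, None))
            continue
        # Poisson uncertainty of the data only; simulation uncertainties are ignored.
        ratios.append((n_data / n_mc, math.sqrt(n_data) / n_mc))
    return ratios
```

A ratio that is compatible with a constant across all bins points to a normalisation difference only, whereas a slope or a localised deviation indicates a shape mismodelling.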

Figure 13: $t\bar{t}$ Analysis: Leading lepton properties. From upper left to lower right are shown: transverse momentum
$p_T$, pseudorapidity $\eta$, azimuthal angle $\phi$, energy $E$, absolute value of the PDG id |PDG id|, and charge $Q$.

Figure 14: $t\bar{t}$ Analysis: Leading lepton isolation and tracking information. From upper left to lower right are
shown: relative transverse energy isolation (etconerel20), relative transverse momentum isolation (ptconerel30),
longitudinal impact parameter $z_0$, and transverse impact parameter $d_0$.

Figure 15: $t\bar{t}$ Analysis: Jet properties. From upper left to lower right are shown: jet multiplicity $N_{jets}$, transverse
momentum $p_T$, pseudorapidity $\eta$, mass $m$, jet vertex fraction (JVF) and MV1 b-tagging weight of the selected jets.

Figure 16: $t\bar{t}$ Analysis: Vertex histograms. The number of vertices $N_{vertex}$ and the z coordinate of the primary vertex
$z_{vertex}$ are shown.

Figure 17: $t\bar{t}$ Analysis: Event variable histograms. The variables plotted are the missing transverse momentum
$E_T^{miss}$ and the transverse mass of the W boson candidate $m_T^W$.

2.4 WZ Analysis

Diboson physics is an important part of the physics programme of ATLAS as it is a probe for electroweak
physics. It enables tests of key predictions of the electroweak theory like the self-coupling of the
electroweak gauge bosons. The WZ analysis was chosen as an example analysis for the ATLAS open data
tools. WZ production is one of the most abundantly produced diboson processes and has a clean final state consisting
of three charged leptons and a neutrino. Reconstructing the WZ system and studying its properties
is possible, but the neutrino makes this slightly harder, which may be seen as an interesting
educational challenge. The available statistics in the ATLAS open data dataset allow for a rediscovery of
the WZ process in a lab course.
This analysis is abridged from the WZ analysis in the fully leptonic channel as it is carried out by ATLAS
using the 2012 dataset [9]. The selected phase space is the one used for extracting the production cross
section of WZ. Although events fitting the WZ final state are present, their number is too small to draw
stringent conclusions in terms of real to simulated data agreement. The standard object selection criteria
(see Table 2) are applied. The event selection criteria are:
Single electron or muon trigger is satisfied;
Event in real data passes the Good Run List;
Event has a good vertex ($N_{tracks} > 4$);
Exactly three good leptons with $p_T > 25$ GeV;
WZ candidate is chosen by finding the Z boson candidate closest to the nominal Z mass;
$|m_{\ell\ell} - m_Z| < 10$ GeV with $m_Z = 91.18$ GeV;
$m_T^W > 30$ GeV.
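The choice of the Z boson candidate among the three leptons can be illustrated as follows. This is a sketch under several assumptions that are not stated in the list above: a Z candidate is taken to be a same-flavour, opposite-charge lepton pair, lepton masses are neglected in the invariant mass, and the leptons are hypothetical dictionaries with the keys `pt`, `eta`, `phi`, `flavour` and `charge` (momenta in GeV).

```python
import math
from itertools import combinations

M_Z = 91.18  # GeV, nominal Z mass quoted in the selection


def invariant_mass(l1, l2):
    """Invariant mass of two leptons, neglecting the lepton masses."""
    m2 = 2.0 * l1["pt"] * l2["pt"] * (
        math.cosh(l1["eta"] - l2["eta"]) - math.cos(l1["phi"] - l2["phi"])
    )
    return math.sqrt(max(0.0, m2))


def select_wz_candidate(leptons, window=10.0):
    """Return (z_indices, w_index, m_ll) for a three-lepton event, or None."""
    if len(leptons) != 3:
        return None
    best = None
    for i, j in combinations(range(3), 2):
        a, b = leptons[i], leptons[j]
        # Assumption: a Z candidate is a same-flavour, opposite-charge pair.
        if a["flavour"] != b["flavour"] or a["charge"] * b["charge"] >= 0:
            continue
        m_ll = invariant_mass(a, b)
        if best is None or abs(m_ll - M_Z) < abs(best[2] - M_Z):
            w_index = 3 - i - j  # the lepton not used in the pair
            best = ((i, j), w_index, m_ll)
    # Mass window around the nominal Z mass.
    if best is None or abs(best[2] - M_Z) >= window:
        return None
    return best
```

The remaining lepton, identified by `w_index`, is the one that would enter the transverse mass of the W boson candidate together with the missing transverse momentum.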
The WZ signal is shown independently from the other diboson processes (WW and ZZ) in the figures in
this section. The overall description of the real data by the simulated data is reasonably good given the
statistical limitations. The kinematics of the three leptons are summarised in Figure 18 and show good
agreement. The isolation and tracking variables depicted in Figure 19 are equally well reproduced by
the simulated data. Because of the low statistics, no jet histograms are included: the expected yields
would be too low for a meaningful comparison. The vertex information depicted in Figure 20 as well as
the invariant mass of the Z boson candidate and the transverse mass of the W boson candidate shown in
Figure 21 do not show any major mismodelling.

Figure 18: WZ Analysis: Lepton properties. From upper left to lower right are shown: transverse momentum
$p_T$, pseudorapidity $\eta$, azimuthal angle $\phi$, energy $E$, absolute value of the PDG id |PDG id|, and charge $Q$ of the
three leptons in the selected events.

Figure 19: WZ Analysis: Lepton isolation and tracking information. From upper left to lower right are shown:
relative transverse energy isolation (etconerel20), relative transverse momentum isolation (ptconerel30), longitudinal
impact parameter $z_0$, and transverse impact parameter $d_0$ of the three leptons in the selected events.

Figure 20: WZ Analysis: Vertex histograms. The number of vertices $N_{vertex}$ and the z coordinate of the primary
vertex $z_{vertex}$ are shown.

Figure 21: WZ Analysis: Event variable histograms. The variables plotted are the invariant mass of the Z boson
candidate $m_{\ell\ell}$, the transverse mass of the W boson candidate $m_T^W$, and the missing transverse momentum $E_T^{miss}$.

2.5 ZZ Analysis

The production of ZZ with subsequent decay to leptons is the dominant Standard Model process with
four charged prompt leptons in the final state. Its low production cross section results in a very low yield
for the ATLAS open data dataset, which highlights the statistical limitations of the dataset. Although some events can
be selected, the low event yield prohibits detailed analysis, and any conclusions drawn are rather qualitative in
nature.
The ZZ analysis implemented in the ATLAS open data tools selects events where both Z bosons decay
to leptons. It is based on the ZZ production cross section measurement carried out at $\sqrt{s} = 7$ TeV [10].
The standard object selection criteria (see Table 2) are applied with a loosened lepton $p_T$ requirement of
$p_T > 10$ GeV. The event selection criteria are:

Single electron or muon trigger is satisfied;
Event in real data passes the Good Run List;
Event has a good vertex ($N_{tracks} > 4$);
Exactly four good leptons with $p_T > 10$ GeV;
Two Z candidates built from lepton pairs of same flavour and opposite charge minimising the total
deviation of both candidates from the Z boson mass;
$|m_{ZCand1} - m_Z| + |m_{ZCand2} - m_Z| < 20$ GeV with $m_Z = 91.18$ GeV.
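The pairing step in the list above amounts to trying the three possible ways of splitting four leptons into two pairs and keeping the one with the smallest total deviation from the Z boson mass. A minimal sketch is shown below; the lepton dictionaries and their keys (`pt`, `eta`, `phi`, `flavour`, `charge`) are hypothetical, momenta are assumed to be in GeV, and lepton masses are neglected in the invariant mass.

```python
import math

M_Z = 91.18  # GeV, Z boson mass quoted in the selection

# The three ways of splitting four leptons into two pairs.
PAIRINGS = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]


def invariant_mass(l1, l2):
    """Invariant mass of two leptons, neglecting the lepton masses."""
    m2 = 2.0 * l1["pt"] * l2["pt"] * (
        math.cosh(l1["eta"] - l2["eta"]) - math.cos(l1["phi"] - l2["phi"])
    )
    return math.sqrt(max(0.0, m2))


def is_sfos(l1, l2):
    """True for a same-flavour, opposite-charge lepton pair."""
    return l1["flavour"] == l2["flavour"] and l1["charge"] * l2["charge"] < 0


def select_zz_candidates(leptons, max_total_deviation=20.0):
    """Return (pair1, pair2, m1, m2) for the best pairing, or None."""
    if len(leptons) != 4:
        return None
    best = None
    for pair1, pair2 in PAIRINGS:
        a, b = leptons[pair1[0]], leptons[pair1[1]]
        c, d = leptons[pair2[0]], leptons[pair2[1]]
        if not (is_sfos(a, b) and is_sfos(c, d)):
            continue
        m1, m2 = invariant_mass(a, b), invariant_mass(c, d)
        deviation = abs(m1 - M_Z) + abs(m2 - M_Z)
        if best is None or deviation < best[0]:
            best = (deviation, pair1, pair2, m1, m2)
    if best is None or best[0] >= max_total_deviation:
        return None
    return best[1:]
```

For events with two electrons and two muons only one pairing is valid, so the minimisation matters only for four-electron and four-muon events.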

The ZZ signal is shown independently from the other diboson processes (WW and WZ) in the figures in
this section. The event yields in this analysis are particularly low, as can be seen in Figures 22 to 25, and
no stringent statement regarding the quality of the description of the real data by the simulated data can
be made. The histograms are included solely for illustrative purposes.

Figure 22: ZZ Analysis: Event variable histograms. The variables plotted are the invariant masses of the two Z boson
candidates $m_{Z1}$ and $m_{Z2}$.

Figure 23: ZZ Analysis: Lepton properties. From upper left to lower right are shown: transverse momentum $p_T$,
pseudorapidity $\eta$, azimuthal angle $\phi$, energy $E$, absolute value of the PDG id |PDG id|, and charge $Q$ of the four
leptons in the selected events.

Figure 24: ZZ Analysis: Lepton isolation and tracking information. From upper left to lower right are shown: relative
transverse energy isolation (etconerel20), relative transverse momentum isolation (ptconerel30), longitudinal impact
parameter $z_0$, and transverse impact parameter $d_0$ of the four leptons in the selected events.

Figure 25: ZZ Analysis: Vertex histograms. The number of vertices $N_{vertex}$ and the z coordinate of the primary
vertex $z_{vertex}$ are shown.

2.6 $H \to WW$ Analysis

The discovery of the Higgs boson in 2012 was one of the milestones of the LHC physics programme. The
$H \to WW$ analysis is of special interest as it was one of the earliest analyses in ATLAS with sizeable
Higgs contributions. In addition, the analysis presented here was used to preselect a part of the data
used by the ATLAS W-path Masterclasses [4]. Thus, it represents a natural extension of the exercises
performed there.
This analysis implements the criteria for the selection of the zero jet bin of the $H \to WW$ analysis with
both W bosons decaying to leptons [11]. The released data will enable users to develop an understanding
of a Higgs analysis. However, they will not be able to derive definitive statements about the existence or
properties of the Higgs boson due to the very limited statistics.
The standard object selection criteria (see Table 2) are applied. The event selection criteria are:
Single electron or muon trigger is satisfied;
Event in real data passes the Good Run List;
Event has a good vertex ($N_{tracks} > 4$);
Exactly two good leptons with $p_T > 25$ GeV;
Leptons have opposite charge;
No jets with $p_T > 25$ GeV;
If leptons have same flavour:
$m_{\ell\ell}^{vis} > 12$ GeV;
$|m_{\ell\ell}^{vis} - m_Z| > 15$ GeV;
$E_T^{miss} > 40$ GeV;
Else:
$m_{\ell\ell} > 10$ GeV;
$E_T^{miss} > 20$ GeV;
$p_{T,\ell\ell} > 30$ GeV;
$\Delta\phi(\ell\ell, E_T^{miss}) > \pi/2$;
$m_{\ell\ell} < 55$ GeV;
$\Delta\phi(leadlep, traillep) < 1.8$ radians.
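The angular and dilepton criteria in this list require the azimuthal differences to be wrapped correctly, since a naive subtraction of two $\phi$ values can exceed $\pi$. The sketch below shows one way to compute $p_{T,\ell\ell}$, $\Delta\phi(\ell\ell, E_T^{miss})$ and $\Delta\phi(leadlep, traillep)$ and to apply the three corresponding thresholds. The function names and the lepton dictionaries with the keys `pt` and `phi` are hypothetical, momenta are assumed to be in GeV, and the mass and $E_T^{miss}$ requirements are deliberately left out.

```python
import math


def delta_phi(phi1, phi2):
    """Signed azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def dilepton_topology(lead, trail, met_phi):
    """Return (pT_ll, |dphi(ll, MET)|, |dphi(lead, trail)|) for two leptons."""
    # Vector sum of the two lepton momenta in the transverse plane.
    px = lead["pt"] * math.cos(lead["phi"]) + trail["pt"] * math.cos(trail["phi"])
    py = lead["pt"] * math.sin(lead["phi"]) + trail["pt"] * math.sin(trail["phi"])
    pt_ll = math.hypot(px, py)
    phi_ll = math.atan2(py, px)
    dphi_ll_met = abs(delta_phi(phi_ll, met_phi))
    dphi_ll = abs(delta_phi(lead["phi"], trail["phi"]))
    return pt_ll, dphi_ll_met, dphi_ll


def passes_topology_cuts(lead, trail, met_phi):
    """Apply the pT_ll, dphi(ll, MET) and dphi(lead, trail) thresholds from the list."""
    pt_ll, dphi_ll_met, dphi_ll = dilepton_topology(lead, trail, met_phi)
    return pt_ll > 30.0 and dphi_ll_met > math.pi / 2.0 and dphi_ll < 1.8
```

The small opening angle between the two leptons combined with a dilepton system recoiling against the missing transverse momentum is the topology these three thresholds select.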
The overall normalisation of the selected $H \to WW$ events looks reasonable. The results shown include the
ratio between the Higgs signal hypothesis and the total background represented by the stacked contributions
estimated via simulated data. The Higgs signal shape is drawn in front of the stack of Standard Model
backgrounds.
Given the low statistics, no precise statements about the description of the real data by the simulation can
be made. Within these limitations, all results shown exhibit good agreement between measured and simulated data.
Figure 26 shows various event variables relevant for the selection of $H \to WW$ events. None of them show any
apparent discrepancy. The lepton kinematics for the leading and trailing leptons are shown in Figure 27
and Figure 29. Additional isolation and tracking information is shown in Figure 28 and Figure 30.
No significant disagreements are visible in any of the lepton histograms. The histograms depicting the
vertex information in Figure 31 show the expected disagreement between simulated and real data. The
pile-up treatment in simulated data considers the whole 2012 run period whilst the real data is taken only
from period D of the 2012 data taking.

Figure 26: $H \to WW$ Analysis: Event variable histograms. The variables plotted from upper left to lower right are
the missing transverse momentum $E_T^{miss}$, the visible mass of the $H \to WW$ candidate $m_{\ell\ell}^{vis}$, the opening angle
between the two selected leptons $|\Delta\phi_{\ell\ell}|$, and the transverse momentum of the dilepton system $p_{T,\ell\ell}$.

Figure 27: $H \to WW$ Analysis: Leading lepton properties. From upper left to lower right are shown: transverse
momentum $p_T$, pseudorapidity $\eta$, azimuthal angle $\phi$, energy $E$, absolute value of the PDG id |PDG id|, and charge
$Q$.

Figure 28: $H \to WW$ Analysis: Leading lepton isolation and tracking information. From upper left to lower right
are shown: relative transverse energy isolation (etconerel20), relative transverse momentum isolation (ptconerel30),
longitudinal impact parameter $z_0$, and transverse impact parameter $d_0$.

Figure 29: $H \to WW$ Analysis: Trailing lepton properties. From upper left to lower right are shown: transverse
momentum $p_T$, pseudorapidity $\eta$, azimuthal angle $\phi$, energy $E$, absolute value of the PDG id |PDG id|, and charge
$Q$.

Figure 30: $H \to WW$ Analysis: Trailing lepton isolation and tracking information. From upper left to lower right
are shown: relative transverse energy isolation (etconerel20), relative transverse momentum isolation (ptconerel30),
longitudinal impact parameter $z_0$, and transverse impact parameter $d_0$.

Figure 31: $H \to WW$ Analysis: Vertex histograms. The number of vertices $N_{vertex}$ and the z coordinate of the
primary vertex $z_{vertex}$ are shown.

2.7 $Z'$ Analysis

Searching for new physics beyond the Standard Model is a cornerstone of the ATLAS physics programme.
Making such searches available in an educational context exemplifies how they are carried out and the
important role of statistical analysis methods. Furthermore, topics such as the sensitivity of a variable
towards new physics or selection optimisation and its dependence on a free theory parameter may be
discussed.
This analysis mimics a $Z' \to t\bar{t}$ analysis in the semileptonic top quark pair channel allowing electrons or
muons as lepton candidates ($Z' \to t\bar{t} \to WbW\bar{b} \to \ell\nu b q\bar{q}\bar{b}$). The standard object selection criteria (see
Table 2) are applied. The event selection criteria are:
Single electron or muon trigger is satisfied;
Event in real data passes the Good Run List;
Event has a good vertex ($N_{tracks} > 4$);
Exactly one good lepton with $p_T > 25$ GeV;
At least four good jets;
At least one b-tagged jet (MV1@70%);
$E_T^{miss} > 30$ GeV;
$m_T^W + E_T^{miss} > 60$ GeV.
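The last criterion combines two quantities that are both sensitive to the neutrino from the W boson decay. Assuming the usual definition of the transverse mass, with $\Delta\phi(\ell, E_T^{miss})$ denoting the azimuthal angle between the lepton and the missing transverse momentum (a symbol introduced here for illustration), the requirement can be written as:

```latex
\begin{align}
  m_T^W &= \sqrt{2\, p_T^{\ell}\, E_T^{miss} \left(1 - \cos\Delta\phi(\ell, E_T^{miss})\right)}, \\
  m_T^W + E_T^{miss} &> 60~\text{GeV}.
\end{align}
```

As an illustrative example with invented numbers, an event with $p_T^{\ell} = 30$ GeV, $E_T^{miss} = 35$ GeV and $\Delta\phi = \pi/2$ has $m_T^W = \sqrt{2 \cdot 30 \cdot 35}$ GeV $\approx 45.8$ GeV and passes, since $45.8 + 35 = 80.8 > 60$.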

The figures in this section show the Standard Model backgrounds stacked on top of each other with the
signal shapes of two $Z'$ mass hypotheses superimposed. The signal processes have been scaled by a factor
of 10 for better visibility. Data are shown as black circles.
The overall agreement between the data and simulated predictions is good. The kinematic description
of the leptons is depicted in Figure 32. Isolation and tracking information are reproduced
reasonably well in the regions with relevant contributions, as can be seen in Figure 33. Figure 34 shows
the kinematics of the jets in the selected events as well as the MV1 b-tagging weights and the jet vertex
fraction. Overall, the kinematics are very well described. The jet vertex fraction and the MV1 weight are
more complex variables, but are well reproduced by the simulated data.
The histograms depicting the vertex information in Figure 35 show the expected disagreement between
simulated and real data. The pile-up treatment in simulated data considers the whole 2012 run period
whilst the real data is taken only from period D of the 2012 data taking. A slight slope is observed in the
data/simulation ratio for the missing transverse momentum (see Figure 36), which is likely to be caused
by either the non-inclusion of QCD contributions or the non-optimal description of $E_T^{miss}$ in the Z and W
samples.

Figure 32: $Z'$ Analysis: Leading lepton properties. From upper left to lower right are shown: transverse momentum
$p_T$, pseudorapidity $\eta$, azimuthal angle $\phi$, energy $E$, absolute value of the PDG id |PDG id|, and charge $Q$.

Figure 33: $Z'$ Analysis: Leading lepton isolation and tracking information. From upper left to lower right are
shown: relative transverse energy isolation (etconerel20), relative transverse momentum isolation (ptconerel30),
longitudinal impact parameter $z_0$, and transverse impact parameter $d_0$.

[Figure 34 panels: stacked histograms with Data/MC ratio panels for Njets (y-axis: Events) and for the jet pT [GeV], jet η, jet mass mjet [GeV], JVF and MV1 weight (y-axis: Jets). Legend: Diboson, DrellYan, W, Z, stop, ttbar, Data, ZPrime1000 x 10, ZPrime500 x 10.]

Figure 34: Z′ Analysis: Jet properties. From upper left to lower right are shown: jet multiplicity Njets, transverse
momentum pT, pseudorapidity η, mass m, jet vertex fraction (JVF), and MV1 b-tagging weight of the selected jets.

[Figure 35 panels: stacked histograms with Data/MC ratio panels for Nvertex and zVertex [mm]; y-axis: Events. Legend: Diboson, DrellYan, W, Z, stop, ttbar, Data, ZPrime1000 x 10, ZPrime500 x 10.]

Figure 35: Z′ Analysis: Vertex histograms. The number of vertices Nvertex and the z coordinate of the primary
vertex zvertex are shown.

[Figure 36 panels: stacked histograms with Data/MC ratio panels for E_T^miss [GeV] and m_T^W [GeV]; y-axis: Events. Legend: Diboson, DrellYan, W, Z, stop, ttbar, Data, ZPrime1000 x 10, ZPrime500 x 10.]

Figure 36: Z′ Analysis: Event variable histograms. The variables plotted are the missing transverse momentum E_T^miss
and the transverse mass of the W boson candidate m_T^W.
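
The transverse mass of the W boson candidate shown in Figure 36 can be illustrated with a short calculation. The following is a minimal sketch, assuming the standard definition m_T^W = sqrt(2 pT E_T^miss (1 - cos Δφ)) built from the lepton transverse momentum and azimuthal angle and from the missing transverse momentum (the quantities stored in lep_pt, lep_phi, met_et and met_phi). The function name is hypothetical and is not part of the released analysis tools.

```python
import math


def transverse_mass(lep_pt, lep_phi, met_et, met_phi):
    """Transverse mass of a lepton plus missing-momentum system.

    Assumed definition: mT = sqrt(2 * pT * ETmiss * (1 - cos(dphi))).
    The result has the same units as lep_pt and met_et.
    """
    dphi = lep_phi - met_phi
    # cos() is periodic, so dphi does not need wrapping into [-pi, pi].
    mt_squared = 2.0 * lep_pt * met_et * (1.0 - math.cos(dphi))
    # Guard against tiny negative values from floating-point rounding.
    return math.sqrt(max(mt_squared, 0.0))


if __name__ == "__main__":
    # Illustrative values only: lepton and missing momentum back to back.
    print(transverse_mass(40.0, 0.0, 40.0, math.pi))
```

For a lepton and missing momentum that are back to back in φ, the transverse mass reaches its maximum of twice the geometric mean of the two transverse momenta; for collinear directions it vanishes.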

3 Summary

The production of the ATLAS open data dataset and the tools accompanying it have been discussed.
The prepared bundle is released in accordance with the ATLAS Data Access Policy [1]. Results of a number
of example analyses inspired by actual ATLAS analyses have been presented to demonstrate possible
applications of the ATLAS open data dataset. Overall, good agreement between data and simulation has been
observed, and sources of possible deviations have been identified and discussed.
The ATLAS open data dataset and tools, together with the documentation that will be made available
separately, are expected to provide an engaging learning environment for undergraduate students and
other interested audiences.

Acknowledgements

We thank CERN for the very successful operation of the LHC, as well as the support staff from our
institutions without whom ATLAS could not be operated efficiently.
We acknowledge the support of ANPCyT, Argentina; YerPhI, Armenia; ARC, Australia; BMWFW
and FWF, Austria; ANAS, Azerbaijan; SSTC, Belarus; CNPq and FAPESP, Brazil; NSERC, NRC and
CFI, Canada; CERN; CONICYT, Chile; CAS, MOST and NSFC, China; COLCIENCIAS, Colombia;
MSMT CR, MPO CR and VSC CR, Czech Republic; DNRF and DNSRC, Denmark; IN2P3-CNRS,
CEA-DSM/IRFU, France; GNSF, Georgia; BMBF, HGF, and MPG, Germany; GSRT, Greece; RGC,
Hong Kong SAR, China; ISF, I-CORE and Benoziyo Center, Israel; INFN, Italy; MEXT and JSPS,
Japan; CNRST, Morocco; FOM and NWO, Netherlands; RCN, Norway; MNiSW and NCN, Poland;
FCT, Portugal; MNE/IFA, Romania; MES of Russia and NRC KI, Russian Federation; JINR; MESTD,
Serbia; MSSR, Slovakia; ARRS and MIZ, Slovenia; DST/NRF, South Africa; MINECO, Spain; SRC
and Wallenberg Foundation, Sweden; SERI, SNSF and Cantons of Bern and Geneva, Switzerland; MOST,
Taiwan; TAEK, Turkey; STFC, United Kingdom; DOE and NSF, United States of America. In addition,
individual groups and members have received support from BCKDF, the Canada Council, CANARIE,
CRC, Compute Canada, FQRNT, and the Ontario Innovation Trust, Canada; EPLANET, ERC, FP7,
Horizon 2020 and Marie Skłodowska-Curie Actions, European Union; Investissements d'Avenir Labex
and Idex, ANR, Région Auvergne and Fondation Partager le Savoir, France; DFG and AvH Foundation,
Germany; Herakleitos, Thales and Aristeia programmes co-financed by EU-ESF and the Greek NSRF;
BSF, GIF and Minerva, Israel; BRF, Norway; Generalitat de Catalunya, Generalitat Valenciana, Spain;
the Royal Society and Leverhulme Trust, United Kingdom.
The crucial computing support from all WLCG partners is acknowledged gratefully, in particular from
CERN and the ATLAS Tier-1 facilities at TRIUMF (Canada), NDGF (Denmark, Norway, Sweden),
CC-IN2P3 (France), KIT/GridKA (Germany), INFN-CNAF (Italy), NL-T1 (Netherlands), PIC (Spain),
ASGC (Taiwan), RAL (UK) and BNL (USA) and in the Tier-2 facilities worldwide.

Appendix

branchname type description


runNumber int run number
eventNumber int event number
channelNumber int channel number
mcWeight float weight of an MC event
pvxp_n int number of primary vertices
vxp_z float z-position of the primary vertex
trigE bool boolean whether a standard trigger is satisfied in the egamma stream
trigM bool boolean whether a standard trigger is satisfied in the muon stream
passGRL bool signifies whether event passes the Good Run List and thus put in isGoodEvent
hasGoodVertex bool signifies whether the event has at least one good vertex
lep_n int number of preselected leptons
lep_truthMatched vector<bool> boolean indicating whether the lepton is matched to a truth lepton
lep_trigMatched vector<bool> boolean signifying whether the lepton is the one triggering the event
lep_pt vector<float> transverse momentum of the lepton
lep_eta vector<float> pseudorapidity of the lepton
lep_phi vector<float> azimuthal angle of the lepton
lep_E vector<float> energy of the lepton
lep_z0 vector<float> z-coordinate of the track associated to the lepton wrt. the primary vertex
lep_charge vector<float> charge of the lepton
lep_flag vector<int> bitmask implementing object cuts of the top group
lep_type vector<int> number signifying the lepton type (e, mu, tau) of the lepton
lep_ptcone30 vector<float> ptcone30 isolation for the lepton
lep_etcone20 vector<float> etcone20 isolation for the lepton
lep_trackd0pvunbiased vector<float> d0 of the track associated to the lepton at the point of closest approach (p.c.a.)
lep_tracksigd0pvunbiased vector<float> d0 significance of the track associated to the lepton at the p.c.a.
met_et float Transverse energy of the missing momentum vector
met_phi float Azimuthal angle of the missing momentum vector
jet_n int number of selected jets
jet_pt vector<float> transverse momentum of the jet
jet_eta vector<float> pseudorapidity of the jet
jet_phi vector<float> azimuthal angle of the jet
jet_E vector<float> energy of the jet
jet_m vector<float> invariant mass of the jet
jet_jvf vector<float> JetVertexFraction of the jet
jet_trueflav vector<int> true flavor of the jet
jet_truthMatched vector<int> information whether the jet matches a jet on truth level
jet_SV0 vector<float> SV0 weight of the jet
jet_MV1 vector<float> MV1 weight of the jet
scaleFactor_BTAG float scalefactor for btagging
scaleFactor_ELE float scalefactor for electron efficiency
scaleFactor_JVFSF float scalefactor for jet vertex fraction
scaleFactor_MUON float scalefactor for muon efficiency
scaleFactor_PILEUP float scalefactor for pileup reweighting
scaleFactor_TRIGGER float scalefactor for trigger
scaleFactor_ZVERTEX float scalefactor for z-vertex reweighting

Table 3: Branches of the tuples for the ATLAS data release. The content of these tuples was defined by the ATLAS
top analysis group. The technical implementation used was AnalysisTop-1.9.1. Superfluous branches not needed
for the educational purposes of the data release were dropped.
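
As an illustration of how the per-lepton branches in Table 3 can be combined, the following minimal sketch computes the invariant mass of the two leading leptons from lep_pt, lep_eta, lep_phi and lep_E. The dictionary-based event representation and the function names are assumptions made for this example only; they are not the interface of the released analysis tools, and the result is returned in whatever units the branches use.

```python
import math


def lepton_four_vector(event, i):
    """Return (E, px, py, pz) for lepton i, built from the branch values."""
    pt = event["lep_pt"][i]
    eta = event["lep_eta"][i]
    phi = event["lep_phi"][i]
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    return (event["lep_E"][i], px, py, pz)


def dilepton_mass(event):
    """Invariant mass of the first two leptons, or None if fewer than two."""
    if event["lep_n"] < 2:
        return None
    e1, px1, py1, pz1 = lepton_four_vector(event, 0)
    e2, px2, py2, pz2 = lepton_four_vector(event, 1)
    energy = e1 + e2
    px, py, pz = px1 + px2, py1 + py2, pz1 + pz2
    m_squared = energy ** 2 - px ** 2 - py ** 2 - pz ** 2
    # Guard against tiny negative values from floating-point rounding.
    return math.sqrt(max(m_squared, 0.0))


if __name__ == "__main__":
    # Hypothetical event: two massless, back-to-back central leptons.
    example = {
        "lep_n": 2,
        "lep_pt": [45.0, 45.0],
        "lep_eta": [0.0, 0.0],
        "lep_phi": [0.0, math.pi],
        "lep_E": [45.0, 45.0],
    }
    print(dilepton_mass(example))
```

The same pattern applies to the jet branches (jet_pt, jet_eta, jet_phi, jet_E), since they store the same kinematic quantities.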

period   N_Events^preselected   N_Events^total   L [pb^-1]   size/Mb
Egamma   7917590                33575219         1000.6      723
Muons    7028084                33815203         1000.6      600

Table 4: Breakdown of the sample of real data with a total integrated luminosity of 1 fb^-1. N_Events^preselected
denotes the number of events after preselection, N_Events^total the number of events prior to the preselection,
L the luminosity of the sample and size the sample size after preselection. The sample is made by combining
the runs 207490, 207532, 207582, 207589, 207749, 207772, 207845, 207865, 207934, 207982, 208126, 208184,
208189, and 208258. The real data is selected using the same preselection as applied on the simulated data.

process DSID Generator σ×FE [pb] f_k L [fb^-1] N_Events^reduced N_Events^preselected size/Mb

tt̄ → l + X 117050 Powheg +Pythia 114.51 1.2 26.236 1500000 20775908 291
tt̄ → Jets 117049 Powheg +Pythia 96.35 1.2 85.027 25170 25170 5.7
single top t-chan top 110090 Powheg +Pythia 17.52 1.05 24.21 150000 1678087 21
single top t-chan antitop 110091 Powheg +Pythia 9.4 1.06 43.23 150000 1719075 15
single top s-chan 110119 Powheg +Pythia 1.64 1.107 167.73 100000 1966242 15
single top Wt-chan 110140 Powheg +Pythia 20.46 1.09 28.50 150000 235557 26
Z+Jets ee 147770 Sherpa 1207.4 1.028 10.08 7500000 49405819 938
Z+Jets mumu 147771 Sherpa 1207.4 1.028 9.63 7500000 60149707 918
Z+Jets tautau 147772 Sherpa 1207.1 1.028 11.08 750000 814528 93
Drell-Yan ee M08to15 173041 Sherpa 92.15 1.0 45.95 400000 447800 57
Drell-Yan ee M15to40 173042 Sherpa 279.19 1.0 47.22 750000 793055 100
Drell-Yan mumu M08to15 173043 Sherpa 92.08 1.0 51.93 500000 520562 74
Drell-Yan mumu M15to40 173044 Sherpa 279.2 1.0 41.01 750000 750246 103
Drell-Yan tautau M08to15 173045 Sherpa 92.12 1.0 27.13 9993 9993 1.5
Drell-Yan tautau M15to40 173046 Sherpa 279.11 1.0 49.54 32393 32393 4.5

W+Jets enu with b 167740 Sherpa 140.34 1.1 12.333 750000 5792095 86
W+Jets enu with jets, bveto 167741 Sherpa 537.84 1.1 9.563 2600000 2648506 296
W+Jets enu no jets, bveto 167742 Sherpa 10295 1.1 1.971 8000000 8448069 722
W+Jets munu with b 167743 Sherpa 140.39 1.1 11.935 750000 5630683 84
W+Jets munu with jets, bveto 167744 Sherpa 466.47 1.1 10.582 2500000 2759594 287
W+Jets munu no jets, bveto 167745 Sherpa 10368 1.1 1.719 7500000 7946599 666
W+Jets taunu with b 167746 Sherpa 140.34 1.1 18.245 100000 531981 13
W+Jets taunu with jets, bveto 167747 Sherpa 506.45 1.1 9.821 250000 273867 31
W+Jets taunu no jets, bveto 167748 Sherpa 10327 1.1 1.945 550000 593205 55
WW 105985 Herwig 12.42 1.683 46.32 500000 1288259 63
ZZ 105986 Herwig 0.992 1.55 151.19 125000 131435 20
WZ 105987 Herwig 3.667 1.9 138.44 500000 517196 68

Table 5: Samples for simulated data of the ATLAS open data dataset describing Standard Model processes. The individual processes are derived using the
simulated datasets with the given dataset id (DSID). Cross sections combined with filter efficiencies are given with the appropriate scaling factors f_k for higher
order QCD corrections where available. After being subjected to a preselection, N_Events^preselected events are available in the samples. A reduction procedure
is applied in order to decrease the processing time and storage requirements, which further reduces the number of events found in the samples. Resulting event
yields after preselection and reduction and the luminosity of these samples are denoted as N_Events^reduced and L, respectively.
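
When simulated samples are compared with the real data, each sample has to be normalised to the luminosity of the data. The following is a minimal sketch of that step, assuming that the L column of the sample tables is the effective luminosity of the released sample and that per-event weights (mcWeight and the scaleFactor_* branches) are applied separately. The function name and the dictionary are hypothetical; the numbers are copied from Tables 4 and 5.

```python
def luminosity_scale(lumi_data_fb, lumi_mc_fb):
    """Factor that rescales a simulated sample to the data luminosity.

    Both luminosities must be given in the same units (here fb^-1).
    """
    if lumi_mc_fb <= 0.0:
        raise ValueError("simulated luminosity must be positive")
    return lumi_data_fb / lumi_mc_fb


# 1000.6 pb^-1 of real data (Table 4), converted to fb^-1.
DATA_LUMI_FB = 1000.6 / 1000.0

# Luminosities of the diboson samples in fb^-1 (Table 5).
MC_LUMI_FB = {"WW": 46.32, "ZZ": 151.19, "WZ": 138.44}

if __name__ == "__main__":
    for name, lumi in MC_LUMI_FB.items():
        print(name, luminosity_scale(DATA_LUMI_FB, lumi))
```

Under these assumptions, a sample with a larger effective luminosity than the data receives a scale factor below one, so each simulated event contributes only a fraction of an event to the stacked histograms.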
process DSID Generator σ×FE [pb] f_k L [fb^-1] N_Events^reduced N_Events^preselected size/Mb

Z′ → tt̄, MZ′ = 400 GeV 110899 Pythia 4.259 1.0 23.48 21941 21941 4.3
Z′ → tt̄, MZ′ = 500 GeV 110901 Pythia 3.925 1.0 25.48 23231 23231 4.7
Z′ → tt̄, MZ′ = 750 GeV 110902 Pythia 1.243 1.0 80.45 25021 25021 5.3
Z′ → tt̄, MZ′ = 1000 GeV 110903 Pythia 0.394 1.0 253.81 25525 25525 5.5
Z′ → tt̄, MZ′ = 1250 GeV 110904 Pythia 0.139 1.0 719.43 25030 25030 5.5
Z′ → tt̄, MZ′ = 1500 GeV 110905 Pythia 0.0524 1.0 1908 24142 24142 5.4
Z′ → tt̄, MZ′ = 1750 GeV 110906 Pythia 0.0211 1.0 4739 23084 23084 5.1
Z′ → tt̄, MZ′ = 2000 GeV 110907 Pythia 0.00894 1.0 11186 21997 21997 4.9
Z′ → tt̄, MZ′ = 2250 GeV 110908 Pythia 0.00394 1.0 25381 21127 21127 4.7
Z′ → tt̄, MZ′ = 2500 GeV 110909 Pythia 0.00180 1.0 55556 20327 20327 4.5
Z′ → tt̄, MZ′ = 3000 GeV 110910 Pythia 0.000434 1.0 230415 19646 19646 4.3
ggH → WW → lνlν, MH = 125 GeV 161005 Powheg +Pythia 6.463 1.0 32.13 100000 278332 14
VBF H → WW → lνlν, MH = 125 GeV 161055 Powheg +Pythia 0.819 1.0 229.93 100000 183101 18
ggH → ZZ → 4l, MH = 125 GeV 160155 Powheg +Pythia 13.17 1.0 14.31 100000 117081 15
VBF H → ZZ → 4l, MH = 125 GeV 160205 Powheg +Pythia 1.617 1.0 104.96 100000 130213 19

Table 6: Samples for simulated data of the ATLAS open data dataset describing Beyond the Standard Model signals and Higgs physics. The individual processes
are derived using the simulated datasets with the given dataset id (DSID). Cross sections combined with filter efficiencies are given with the appropriate scaling
factors f_k for higher order QCD corrections where available. After being subjected to a preselection, N_Events^preselected events are available in the samples.
A reduction procedure is applied in order to decrease the processing time and storage requirements, which further reduces the number of events found in the
samples. Resulting event yields after preselection and reduction and the luminosity of these samples are denoted as N_Events^reduced and L, respectively.

References

[1] ATLAS Collaboration, ATLAS Data Access Policy, (2014).
[2] Kaggle Higgs Boson Machine Learning Challenge,
https://www.kaggle.com/c/higgs-boson, Accessed: 2016-06-16.
[3] R. Brun and F. Rademakers, ROOT: An object oriented data analysis framework,
Nucl. Instrum. Meth. A389 (1997) 81.
[4] ATLAS Masterclasses W-path,
http://atlas.physicsmasterclasses.org/en/wpath.htm, Accessed: 2016-06-14.
[5] ATLAS Collaboration, Measurement of the W charge asymmetry in the W → μν decay mode in
pp collisions at √s = 7 TeV with the ATLAS detector, Phys. Lett. B701 (2011) 31,
arXiv: 1103.2929 [hep-ex].
[6] G. Aad et al., Measurement of the W → ℓν and Z/γ* → ℓℓ production cross sections in
proton-proton collisions at √s = 7 TeV with the ATLAS detector, JHEP 12 (2010) 060,
arXiv: 1010.2130 [hep-ex].
[7] K. A. Olive et al., Review of Particle Physics, Chin. Phys. C38 (2014) 090001.
[8] ATLAS Masterclasses Z-path,
http://atlas.physicsmasterclasses.org/en/zpath.htm, Accessed: 2016-06-14.
[9] ATLAS Collaboration, Measurements of W±Z production cross sections in pp collisions at
√s = 8 TeV with the ATLAS detector and limits on anomalous gauge boson self-couplings,
Phys. Rev. D93 (2016) 092004, arXiv: 1603.02151 [hep-ex].
[10] ATLAS Collaboration, Measurement of the ZZ production cross section and limits on anomalous
neutral triple gauge couplings in proton-proton collisions at √s = 7 TeV with the ATLAS detector,
Phys. Rev. Lett. 108 (2012) 041804, arXiv: 1110.5016 [hep-ex].
[11] ATLAS Collaboration, Search for the Standard Model Higgs boson in the H → WW → ℓνℓν
decay mode using 1.7 fb^-1 of data collected with the ATLAS detector at √s = 7 TeV,
tech. rep. ATLAS-CONF-2011-134, CERN, 2011,
url: http://cds.cern.ch/record/1383837.
