High Level Analysis Package

We have created a high-level analysis package, "HLAnalysis", for the needs of our ATLAS group here at Oxford. It is intended to provide the ATLAS analyser with the tools to carry out a high-level analysis on their local cluster or laptop, decoupled from Athena. The basic idea is that the HLAnalysis package acts as an interface, so that any changes on the Athena side are caught in the HLA package, whereas your analysis code stays unaffected. Its approach is similar to that of the UserAnalysisEvent package:
  • we write out objects to the HLAnalysis AANT, e.g. electrons, jets, TPObjects (~TrackParticles), MEt, etc.
  • each object comes with an interface, so that you can do something like electron->pt(), electron->isEM(), etc.
BUT:
  • our analysis package is supposed to be compilable on any Linux machine with gcc 3.2.3 onwards and a ROOT version 5 installation
  • we tried to keep it as simple as possible, which means:
    • fewer chances for bugs, and broad compatibility
    • a reduction in size of the written-out AANT's by a factor of up to ~10.

We have written the package for Athena release 11.0.42, but it should work with any version 11 release. Adaptations will be needed for version 12; these are envisioned.

Technical Key Points:

More technically speaking, the HLAnalysis package consists of two parts: the Athena side, which is used to produce HLAnalysis AANT's, and your local HLAnalysis installation, which provides you with the library you need to read back the class structure of the objects stored in the HLAnalysis AANT. There are already several HLAnalysis files which have been produced from 11.0.42 AOD's for WZ, ZZ, Z->ee+jet, Z->mumu+jet, ttbar, SU2 and SU3 (located at /castor/cern.ch/user/o/obrandt/HLA3), so please feel free to try your hand at these.

HLAnalysis package: Athena side:

As already mentioned, we follow the philosophy of the UserAnalysisEvent package, i.e. we have introduced custom classes like ElectronObject, OxMuon, JetObject, MCParticle, TPObject (~TrackParticle) and MissingET, which are written to an HLAnalysis AANT. Similar to their "normal" counterparts used in Athena, they inherit from other classes, like P4PxPyPz. The class structure is documented at

http://www-pnp.physics.ox.ac.uk/~obrandt/HLAnalysis_DOCUMENTATION/
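To illustrate the inheritance idea, here is a minimal C++ sketch of a P4PxPyPz-style four-momentum base class with an electron-like class deriving from it. The class and member names here are simplified placeholders for illustration, not the package's exact API; see the documentation above for the real class structure.

```cpp
#include <cassert>
#include <cmath>

// Illustrative sketch: a four-momentum base class storing (px, py, pz, E),
// from which kinematic accessors such as pt() are derived.
class P4PxPyPz {
public:
    P4PxPyPz(double px, double py, double pz, double e)
        : m_px(px), m_py(py), m_pz(pz), m_e(e) {}
    double px() const { return m_px; }
    double py() const { return m_py; }
    double pz() const { return m_pz; }
    double e()  const { return m_e; }
    // Transverse momentum computed from the stored components:
    // pt = sqrt(px^2 + py^2), as verified in the interactive session below.
    double pt() const { return std::sqrt(m_px * m_px + m_py * m_py); }
private:
    double m_px, m_py, m_pz, m_e;
};

// An electron-like object adds identification information on top of the
// four-momentum, e.g. an isEM flag and the reconstruction author.
class ElectronObject : public P4PxPyPz {
public:
    ElectronObject(double px, double py, double pz, double e,
                   int isEM, unsigned author)
        : P4PxPyPz(px, py, pz, e), m_isEM(isEM), m_author(author) {}
    int isEM() const { return m_isEM; }
    unsigned author() const { return m_author; }
private:
    int m_isEM;
    unsigned m_author;
};
```

This is why calls such as electron->pt() and electron->isEM() work uniformly on the objects read back from the AANT.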

In order for this to work, we have introduced a new package for Athena, the HLAnalysis package, which basically contains our new classes. Once compiled, it provides a header file which must be included in the UserAnalysis package. Further, we have written filling algorithms for the UserAnalysis package, which are invoked to fill the HLAnalysis class containers when reading data from AOD's. Moreover, we have introduced the possibility to filter events on MC truth level; for details see the paragraph "Installation of the Athena part of the HLAnalysis package" below. We also had to modify the ParticleEvent package in order to add some getter functions to "normal" Athena classes for links to TrackParticles. Thus, in HLAnalysis AANT's the TrackParticle corresponding to each electron/muon can be identified by a unique integer index member. All three packages, HLAnalysis, ParticleEvent and UserAnalysis, are in CVS at lxplus.
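As an illustration of the index-based linking, the following C++ sketch shows how an electron's integer track index could be resolved against the event's TPObject collection. The struct and member names are assumptions for illustration only (the real classes live in the package; cf. the tracklink() call in the interactive session below):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-ins for the package's TPObject and electron
// classes: each TPObject carries a unique integer index within the event, and
// each electron stores the index of its associated TrackParticle.
struct TPObject { int index; double pt; };
struct Electron { bool hasTrack; int tracklink; };

// Returns a pointer to the TPObject matched to the electron, or nullptr if
// the electron has no associated track or no TPObject carries that index.
const TPObject* matchedTrack(const Electron& el,
                             const std::vector<TPObject>& tracks) {
    if (!el.hasTrack) return nullptr;
    for (const TPObject& tp : tracks)
        if (tp.index == el.tracklink) return &tp;
    return nullptr;
}
```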

For more technical details, please have a look at the source code, as well as the directions on the website

AddingObjectsToAAN

For installation documentation see the paragraph "Installation of the Athena part of the HLAnalysis package" below. The whole installation procedure on the Athena side takes about half an hour, most of which is compilation time.

HLAnalysis package: local side:

At our local cluster in Oxford, I have started a CVS repository at /userdisk3/brandt/CVS_HLA, which can be used to check out our HLAnalysis package, and there are a couple of HLA AANT's to look into at /data/atlas/atlasdata3/obrandt/HLA. The files are the same as those on CASTOR above; additionally there are merged files, i.e. files with the full statistics (~50k events) per physics process, produced by merging the sub-files. The package can be compiled to give you the libDict.so library, which you need to load in order to read back our HLA AANT's with the proper class structure. Moreover, there is a basic analysis skeleton which can be compiled and which uses libDict.so to read back our custom AANT's. We (i.e. mostly Seshadri, our M.Phys. student) have started to work on the selection for SU2, and you can have a look at the code to see how we place some basic cuts on the leptons, MEt, etc. The main advantage is that you can do something like electron->pt(), electron->isEM(), etc.

There is documentation available on our custom HLA classes like Electron, Muon, TPObject, Jet, etc.; it can be found in the doc_html directory of the package or at

http://www-pnp.physics.ox.ac.uk/~obrandt/HLAnalysis_DOCUMENTATION/

For installation documentation see next 2 paragraphs. The installation on the Oxford cluster should not take more than 1/4-1/2 hour.

HLAnalysis installation at Oxford (on ppslgen)

To use the package here at Oxford (ppslgen), you need to (for bash):
    export CVSROOT=/userdisk3/brandt/CVS_HLA
    cvs co -P HLAnalysis
This will install the HLAnalysis package in the 'HLAnalysis' directory. At its top level you will find the analysis skeleton, which consists of cut_flow.hpp, cut_flow.cpp, histos.hpp and a Makefile, and, most importantly, instructions in the instructionsREADME file. In the include directory the libDict.so library can be compiled with a separate Makefile, which will enable you to read back classes from HLAnalysis AANT's. Following the instructions in instructionsREADME, you will need to:
    cd include
    make
    cd ..
    make
The first step compiles the libDict.so library; the second compiles the analysis skeleton to produce the executable run_cut_flow, which places some basic cuts and saves some histograms in the output directory. Since the environment is the same for everyone on ppslgen, I have left the library and the binaries in the package, so there should not be any need to recompile. Consequently, you might need to issue make clean to force recompilation.

Now you are ready to go! To see the interface of the run_cut_flow executable, do:

    ./run_cut_flow -h
By default it will run on a small SU2 file in the 'input' directory, which contains ~300 events. This file comes with your installation for testing purposes. You will find more HLA AANT's in
    /data/atlas/atlasdata3/obrandt/HLA

Running the executable will produce some numeric cut output on the screen and a root file with histograms in the output directory. To see the technicalities, please have a detailed look into the cut_flow.cpp file, or at the package documentation.
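In the same spirit as cut_flow.cpp, the following C++ sketch shows the kind of counting a cut flow does. The struct, function names and the pt threshold are placeholders for illustration, not the actual cuts in the package (momenta in MeV, as is conventional in ATLAS):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified lepton record for illustration.
struct Lepton { double pt; };

// Count leptons above a transverse-momentum threshold (MeV).
std::size_t countAbove(const std::vector<Lepton>& leptons, double ptCutMeV) {
    std::size_t n = 0;
    for (const Lepton& l : leptons)
        if (l.pt > ptCutMeV) ++n;
    return n;
}

// An event passes the "N leptons" selection if at least nRequired leptons
// survive the pt cut; the cut-flow counters (e.g. NSel1ElEvents, N2MuPt)
// are incremented for each event that passes the corresponding stage.
bool passesNLeptonCut(const std::vector<Lepton>& leptons,
                      std::size_t nRequired, double ptCutMeV) {
    return countAbove(leptons, ptCutMeV) >= nRequired;
}
```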

Caveat: if, despite having produced libDict.so, you get an error message like "could not load libDict.so" (or similar) when running the executable, you should check your LD_LIBRARY_PATH setting:

bash-2.05b$ echo $LD_LIBRARY_PATH
:/system/cern/ROOT/v5-12-00e/lib/root:/system/cern/CLHEP/pro/lib
It should start with a colon (:): the leading empty entry means that your executables will also look for libraries in the current directory, i.e. a leading ":" is equivalent to ".:".

Running HLAnalysis in ROOT interactive mode

After compiling the library as above, you can use the HLA package in ROOT interactive mode by doing e.g.:
root [0] Cintex::Enable() ;
root [1] gSystem->Load("libDict.so") ;
root [2] TFile *f = new TFile("input/SU2_FilterV2_1000evt.aan_01.root") ;
Warning in <TClass::TClass>: no dictionary for class AttributeListLayout is available
Warning in <TClass::TClass>: no dictionary for class pair<string,string> is available
root [3] TTree *AANtuple=CollectionTree;
root [4] TBranch *branch_e = AANtuple->GetBranch("Electrons");
root [5] std::vector<User::ElectronObject>  *E_vec = 0;
root [6] branch_e->SetAddress(&E_vec);
root [7] int nentries = branch_e->GetEntries();
root [8] cout << nentries
317(class ostream)68884352
root [9] branch_e->GetEntry(7);
root [10] E_vec->size()
(const unsigned int)1
root [13] User::ElectronObject electron = (*E_vec)[0]
root [14] electron.pt()
(const double)1.01952299123916305e+03
root [15] electron.px()
(const double)5.57247834155363222e+01
root [16] electron.py()
(const double)1.01799895784747343e+03
root [17] sqrt( electron.px()*electron.px()+electron.py()*electron.py() )
(const double)1.01952299123916305e+03
root [18] electron.isEM()
(int)(-1)
root [21] electron.author()
(const unsigned int)2
root [22] electron.hasTrack()
(const bool)1
root [23] electron.tracklink()
(const int)24
An old version of our analysis (at that time still in ROOT interactive mode) can be found in the cut_flow.C file. Try this out by doing:
root [3] .x cut_flow.C
 Ntuple contains 317 entries.
NSel1ElEvents  : 47
NSel2ElEvents  : 5
NSel3ElEvents  : 1
NSel1MuEvents  : 92
NSel2MuEvents  : 50
NSel3MuEvents  : 26
NSel3LepEvents : 29
N1ElPt         : 31
N2ElPt         : 3
N3ElPt         : 0
N1MuPt         : 58
N2MuPt         : 19
N3MuPt         : 7
NLepPt         : 7
NJetVeto       : 1
NEtMiss        : 1
root [4]
Additionally, quite a few histograms will pop up on the screen.

HLAnalysis installation at any site

You should be able to compile and run the code without any major difficulties if your machine has gcc 3.2.3 or later and /afs/cern.ch mounted. If your institute's cluster fulfils these requirements, it is straightforward: after extracting the package, simply follow the steps described above for Oxford.

You can download the current version of the HLAnalysis package for local analysis as HLAnalysis.tar.gz from http://www-pnp.physics.ox.ac.uk/~obrandt/HLAnalysis/

The requirement to have /afs/cern.ch mounted comes from the fact that, when generating the libDict.so library, the same mechanism as in Athena is used, i.e. the selection.xml file in the src directory of the package is parsed with the genreflex.py script. At our departmental cluster, the parser genreflex.py resides in the $ROOTSYS/lib/root/python/genreflex directory. The crucial point is that this Python script invokes the gccxml compiler, which is rarely installed locally. However, you can use the gccxml compiler at /afs/cern.ch/... (this is foreseen in the genreflex.py file), and this is where the importance of /afs/cern.ch comes in. If your institute's cluster does not have any /afs directories mounted, you might try your luck with the original HLAnalysisDict_rflx.cpp file (the one generated by genreflex.py): when generating the library, do not make clean in the include directory, but rather delete libDict.so by hand (watch out, libDict.so is not in the include directory but one level above) and try a make. If your ROOT version is not crucially different from ours (5.12/00e), this should work.

Installation of the Athena part of the HLAnalysis package

In order to run the Athena part of the HLAnalysis package, you need Athena release 11.0.42 (in principle, quite a few version 11 releases should be compatible). First, set up a new Athena distribution as described in the WorkBook, and, inside your installation directory, do:
[lxplus100.cern.ch]> pwd
/afs/cern.ch/user/o/obrandt/scratch0/testarea2/11.0.42
[lxplus100.cern.ch]> mkdir PhysicsAnalysis
[lxplus100.cern.ch]> cd PhysicsAnalysis/
[lxplus100.cern.ch]> mkdir AnalysisCommon
[lxplus100.cern.ch]> cd AnalysisCommon/
[lxplus210.cern.ch]> cvs co -d HLAnalysis users/obrandt/HLAnalysis
[lxplus210.cern.ch]> cvs co -d UserAnalysis users/obrandt/UserAnalysis
[lxplus210.cern.ch]> cvs co -d ParticleEvent users/obrandt/ParticleEvent
[lxplus210.cern.ch]> cd UserAnalysis/UserAnalysis-00-05-11/cmt/
[lxplus210.cern.ch]> cmt broadcast cmt config
[lxplus210.cern.ch]> cmt broadcast source setup.sh
[lxplus210.cern.ch]> cmt broadcast make
Technically speaking, this produces the Athena HLAnalysis library and Athena job options, which are then included in the run script. This enables Athena to write out our classes to ROOT files. If you change the HLAnalysis package, you need to copy its header and source files into the corresponding directories of your local HLAnalysis package and recompile, et voilà! But do not forget to adjust the filling algorithms in the UserAnalysis package accordingly. In the run directory you will find the test job options WZ_test.py, which should run out of the box. Do not worry about output like:
...
ApplicationMgr       INFO Application Manager Initialized successfully
Error: class,struct,union or type User not defined  FILE: LINE:0
Error: class,struct,union or type User not defined  FILE: LINE:0
RootClassLoader: level[Info] Failed to load dictionary for native class: "pair<int,HepMC::GenParticle*>"
RootClassLoader: level[Info] Failed to load dictionary for native class: "pair<int,HepMC::GenParticle*>"
RootClassLoader: level[Info] Failed to load dictionary for native class: "pair<int,int>"
RootClassLoader: level[Info] Failed to load dictionary for native class: "pair<int,HepMC::GenVertex*>"
RootClassLoader: level[Info] Failed to load dictionary for native class: "pair<int,HepMC::GenVertex*>"
Error in <TBranchElement::Fill>: attempt to fill branch ConeJets while addresss is not set
Error in <TBranchElement::Fill>: attempt to fill branch MissingEt while addresss is not set
Error in <TBranchElement::Fill>: attempt to fill branch MCEventInfo while addresss is not set
Error in <TBranchElement::Fill>: attempt to fill branch MCParticles while addresss is not set
Error in <TBranchElement::Fill>: attempt to fill branch Photons while addresss is not set
Error in <TBranchElement::Fill>: attempt to fill branch Electrons while addresss is not set
Error in <TBranchElement::Fill>: attempt to fill branch TPcandidates while addresss is not set
Error in <TBranchElement::Fill>: attempt to fill branch StacoMuons while addresss is not set
Error in <TBranchElement::Fill>: attempt to fill branch MuidMuons while addresss is not set
...
For each job option file you run (not each AOD), exactly one event -- the first one -- will be missing. A fix is foreseen for release 12.

A particularly useful feature of the job options is that you can filter on MC truth level. After setting the switch FilterSkeleton.DoTruthFilter to True, only events with a chargino/next-to-lightest neutralino coming from the primary vertex will be written to the HLA AANT. Of course, you can implement any criteria you like in the FilterSkeleton.cxx/.h file.
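A filter predicate of this kind can be sketched in C++ as below. The struct layout and helper names are assumptions for illustration; the PDG codes used (±1000024 for the lightest chargino, 1000023 for the next-to-lightest neutralino) follow the standard SUSY numbering scheme, but the actual selection lives in FilterSkeleton.cxx/.h:

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Hypothetical, simplified truth-particle record for illustration.
struct TruthParticle { int pdgId; bool fromPrimaryVertex; };

// Keep the event if any primary-vertex particle is a chargino or a
// next-to-lightest neutralino (standard SUSY PDG codes assumed).
bool passesTruthFilter(const std::vector<TruthParticle>& particles) {
    for (const TruthParticle& p : particles) {
        if (!p.fromPrimaryVertex) continue;
        int id = std::abs(p.pdgId);
        if (id == 1000024 || id == 1000023)  // chargino_1 / neutralino_2
            return true;
    }
    return false;
}
```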

For more technical details on how the HLAnalysis package was implemented, please have a look at the code. Our basic strategy was along the outline for the UserAnalysisEvent package on the website by Ketevi:

AddingObjectsToAAN

HLA AANT Files to be analysed

These can be found on Castor in
   /castor/cern.ch/user/o/obrandt/HLA3
or on our local cluster at Oxford in
   /data/atlas/atlasdata3/obrandt/HLA
Caveat: due to a bug on the Athena side, the first event in a job cannot be written out to the HLA AANT (which is surely a problem, but a fix is foreseen for release 12). However, the run and event number branches (which are simple Int_t's) are the only ones which contain entries for all events (i.e. also the "skipped" ones). Therefore, when merging HLA AANT files into a single one, ROOT will believe there is one event more than actually available in each of the subfiles. As a result, you will get duplicated events at the end of each subfile. As a workaround, we remove these duplicated events with a simple check; these ~10 lines of code can be found in the cut_flow.cpp file.
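The duplicate-removal check can be sketched as follows; this C++ snippet is an illustration of the idea (struct and function names are placeholders), not the exact code from cut_flow.cpp. Any (run, event) pair that has already been processed is simply skipped:

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Hypothetical, simplified event header carrying the run and event number,
// which are filled for every event, including the "skipped" ones.
struct EventHeader { int run; int event; };

// Keep only the first occurrence of each (run, event) pair, dropping the
// duplicated events that appear at the end of each merged subfile.
std::vector<EventHeader> dropDuplicates(const std::vector<EventHeader>& in) {
    std::set<std::pair<int, int>> seen;
    std::vector<EventHeader> out;
    for (const EventHeader& ev : in) {
        if (seen.insert(std::make_pair(ev.run, ev.event)).second)
            out.push_back(ev);  // first time this (run, event) is seen
    }
    return out;
}
```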

Documentation

A commented Doxygen documentation of the basic objects used in the package can be found at

http://www-pnp.physics.ox.ac.uk/~obrandt/HLAnalysis_DOCUMENTATION/

Good luck with your analysis!

Authors

The HLAnalysis package was developed by:

  • P. Bruckman de Renstrom
  • M. Fiascaris
  • G. Kirsch
  • K. Lohwasser
  • O. Brandt
The analysis code was mostly written by S. Nadathur.

-- OlegBrandt - 26 Mar 2007

Topic revision: r4 - 2007-06-13 - OlegBrandt