How to Skim/Slim a D3PD
The following is based on the Twiki:
https://twiki.cern.ch/twiki/bin/view/Main/NTUPtoNTUP
1. Prepare the workarea and checkout the files
mkdir -p $HOME/SKIM
cd $HOME/SKIM
asetup AtlasPhysics,17.2.5.2.3,here
cmt co -r NTUPtoNTUPCore-00-00-06 PhysicsAnalysis/NTUPtoNTUP/NTUPtoNTUPCore
cmt co -r NTUPtoNTUPExample-00-00-07 PhysicsAnalysis/NTUPtoNTUP/NTUPtoNTUPExample
2. Copy the skimming python code
Copy the skim.py file into the directory:
$HOME/SKIM/PhysicsAnalysis/NTUPtoNTUP/NTUPtoNTUPExample/python
3. Compile
cd $HOME/SKIM/PhysicsAnalysis/NTUPtoNTUP/NTUPtoNTUPCore/cmt
make
cd $HOME/SKIM/PhysicsAnalysis/NTUPtoNTUP/NTUPtoNTUPExample/cmt
make
Have a look at the skim.py file to see which variables will be kept and what the lepton-level filtering is: at least 2 leptons with pT above 5 GeV.
4. Run some tests
LOCAL TEST with:
SkimNTUP_trf.py inputNTUP_SMWZFile=/afs/cern.ch/work/k/kbachas/public/mc12_WZ_cutFlowFiles/NTUP_SMWZ.00855477._000014.root outputNTUP_MYSKIMNTUPFile=myTestNtup.root
Note that the trf calls NTUPtoNTUPCore/share/SkimNTUP_topOptions.py, which in turn calls NTUPtoNTUPExample/share/MySkimNTUP_prodJobOFragment
GRID TEST with:
source /afs/cern.ch/atlas/offline/external/GRID/DA/panda-client/latest/etc/panda/panda_setup.sh
export PATHENA_GRID_SETUP_SH=/afs/cern.ch/project/gd/LCG-share/current_3.2/etc/profile.d/grid-env.sh
pathena --trf "SkimNTUP_trf.py inputNTUP_SMWZFile=%IN outputNTUP_MYSKIMNTUPFile=%OUT.mySkimNtup.root" --inDS=mc12_8TeV.129477.PowhegPythia8_AU2CT10_WZ_Wm11Z11_mll0p250d0_2LeptonFilter5.merge.NTUP_SMWZ.e1300_s1469_s1470_r3542_r3549_p1328/ --outDS=user.kbachas.mc12_8TeV.129477.PowhegPythia8_AU2CT10_WZ_Wm11Z11_mll0p250d0.e1300_s1469_s1470_r3542_r3549_p1328/ --nFiles=2 --nFilesPerJob=1
Moving/sharing your SFrame setup
If you want to copy your SFrame setup to share it with your colleagues, follow these instructions. This avoids repeating numerous hacks in AnalysisBase and other deviations from the SVN trunk version. Simply create a tarball of your testarea/ElectroweakBosons folder and share that file.
1. Copy the tarball and decompress it:
tar -xzvf sframe.tgz
cd ElectroweakBosons
2. Remove the .svn folders recursively; optionally also remove the .__afs folders (left-overs from the AFS client, sometimes big in size). Note the quotes around the glob, so the shell passes it to find unexpanded:
find . -name .svn -exec rm -rf {} +
find . -name '.__afs*' -exec rm -rf {} +
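To convince yourself what the find ... -exec rm -rf {} + pattern removes before running it on your real area, here is a self-contained demo on a throw-away directory (all paths below are examples, not your analysis area):

```shell
# create a fake package with a .svn folder in a scratch area
rm -rf /tmp/svn_cleanup_demo
mkdir -p /tmp/svn_cleanup_demo/pkg/.svn /tmp/svn_cleanup_demo/pkg/src
touch /tmp/svn_cleanup_demo/pkg/.svn/entries /tmp/svn_cleanup_demo/pkg/src/Analysis.cxx

# remove the .svn folders exactly as above
find /tmp/svn_cleanup_demo -name .svn -exec rm -rf {} +

# the source folder survives, only .svn is gone
ls /tmp/svn_cleanup_demo/pkg
```

To preview rather than delete, replace -exec rm -rf {} + with -print first.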
3. Setup environment
bash
source scripts/setup_lxplus.sh
4. Clean-up
cd RootCore/RootCore
./configure
cd ../..
source scripts/setup_common.sh
cd RootCore
$ROOTCOREDIR/scripts/find_packages.sh
cd ..
make clean
make distclean
5. Compile everything
make
How to run Proof-on-Demand (PoD)
https://indico.cern.ch/getFile.py/access?contribId=1&resId=0&materialId=slides&confId=217328
To install PoD (first time only):
wget http://pod.gsi.de/releases/pod/3.10/PoD-3.10-Source.tar.gz
tar -xzvf PoD-3.10-Source.tar.gz
source scripts/setup_lxplus.sh (from SFrame)
cd PoD-3.10-Source
mkdir build
cd build
cmake -C ../BuildSetup.cmake ..
make -j4 install
source scripts/setup_pod_lxplus.sh (from SFrame)
Change in $HOME/.PoD/PoD.cfg:
work_dir=/tmp/$USER/PoD
Create $HOME/.PoD/user_worker_env.sh:
echo "Setting user environment for workers ..."
source /afs/cern.ch/sw/lcg/contrib/gcc/4.3/x86_64-slc5-gcc43-opt/setup.sh
export LD_LIBRARY_PATH=/afs/cern.ch/sw/lcg/external/qt/4.4.2/x86_64-slc5-gcc43-opt/lib:/afs/cern.ch/sw/lcg/external/Boost/1.47.0_python2.6/x86_64-slc5-gcc43-opt//lib:/afs/cern.ch/sw/lcg/external/xrootd/3.1.0p2/x86_64-slc5-gcc43-opt/lib64:/afs/cern.ch/sw/lcg/app/releases/ROOT/5.32.02/x86_64-slc5-gcc43-opt/root/lib:/afs/cern.ch/sw/lcg/external/Python/2.6.2/x86_64-slc5-gcc43-opt/lib:/afs/cern.ch/sw/lcg/contrib/gcc/4.3.5/x86_64-slc5-gcc34-opt/lib64:/afs/cern.ch/sw/lcg/contrib/mpfr/2.3.1/x86_64-slc5-gcc34-opt/lib:/afs/cern.ch/sw/lcg/contrib/gmp/4.2.2/x86_64-slc5-gcc34-opt/lib
To run PoD:
source scripts/setup_lxplus.sh
source scripts/setup_pod_lxplus.sh
pod-server start
In your xml config file, use:
RunMode="PROOF"
ProofServer="pod://"
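In an SFrame configuration these are attributes of the <Cycle> element; a minimal sketch, with the cycle name being a placeholder for your own:

```xml
<Cycle Name="AnalysisZZCycle" RunMode="PROOF" ProofServer="pod://">
  <!-- your usual InputData and UserConfig blocks go here, unchanged -->
</Cycle>
```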
Create a worker cluster on lxbatch, using N cores, and check if the workers are ready to use:
pod-submit -r lsf -q 1nd -n N
pod-info -n
Now submit your job with 'sframe_main ...'.
Avoid compilation on PoD nodes
It is EXTREMELY time-saving to adjust a handful of scripts so that compilation is not repeated on each and every worker node.
The procedure is described in the presentation by Max Bellomo, named tutorial.pdf (see attachments)
Tip: After making all the changes described above, issue, under your main SFrame directory:
make clean; make distclean; make; cd AnalysisXY; make clean; make distclean; make;
and then restart the PoD server (if it was running) in order to propagate all the changes.
How to run SFrame on the GRID
Update your 'grid/' folder to this:
https://svnweb.cern.ch/trac/atlasinst/browser/Institutes/CERN/ElectroweakBosons/trunk/grid/
in order to use the existing scripts,
grid_build.sh (to build the job on the grid),
grid_run.sh (to run sframe on the grid),
prun.sh (to submit the job).
Follow the next steps.
1) Create a configuration file for your grid job, say grid_analysiszz.xml, by merging your standard analysis configuration file and CycleConfig.xml (see the attached file as an example). You should remove unnecessary entities and make sure these lines are present in your merged file:
<!ENTITY grid SYSTEM "input_grid.xml">
...
&grid;
(This file, input_grid.xml, is created automatically when grid_run.sh runs on the grid and will include the correct input file for each subjob.)
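As a sketch, the head of the merged file could then look like this (JobConfiguration and Cycle follow the standard SFrame configuration schema; the JobName, library and cycle names below are placeholders for your own):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE JobConfiguration [
  <!ENTITY grid SYSTEM "input_grid.xml">
]>
<JobConfiguration JobName="AnalysisZZJob" OutputLevel="INFO">
  <Library Name="libAnalysisZZ"/>
  <Cycle Name="AnalysisZZCycle" RunMode="LOCAL">
    &grid;
    <!-- UserConfig items as in your standard configuration file -->
  </Cycle>
</JobConfiguration>
```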
Make sure that all configuration files are in your ElectroweakBosons path and are defined properly in your configuration file. For example, these will work on the grid:
<Item Name="JetAFIICalibconfigFile" Value=":exp:$EWPATH/RootCore/ApplyJetCalibration/data/CalibrationConfigs/Rel17_JES_AFII.config"/>
<Item Name="JESconfigFile" Value="JES_2012/Moriond2013/InsituJES2012_20NP_ByCategory.config"/>
while this will NOT work:
<Item Name="PileupDataFileName" Value="/afs/cern.ch/atlas/project/physics/Dibosons/corrections/pileup/ilumicalc_histograms_None_200842-215643_v4.root"/>
Therefore, you should create a folder in your ElectroweakBosons path, say ExtRootFiles, and copy all the needed files there. Make sure you include these files when submitting your job with prun.sh (see below).
2) In grid_build.sh AND grid_run.sh, remove any gcc or ROOT configuration and add at the beginning:
setupATLAS
localSetupGcc
localSetupROOT --skipConfirm
In grid_run.sh, add a statement to call your new configuration file:
elif [ "$3" == "AnalysisZZ" ] ; then
echo "Doing ZZ Analysis ..."
sframe_main grid/grid_analysiszz.xml
3) Edit prun.sh to submit your job to the grid:
SKIM="AnalysisZZ"
...
VERSION="yourVersion"             # just a string to be inserted in the output container name
CONTAINER=""                      # set to "/" if you are running on a dataset container (the usual case)
RELEASE=17.2.7.4.1                # doesn't matter, will be configured later
CMTCONFIG=x86_64-slc5-gcc43-opt   # same here...
LIST_USERSHARING=(
# "kbachas"
# "mbellomo"
# "vkousk"
# "mschott"
# "dkyriazo"
"iliadis"                         # comment out all other users and add your grid name here
)
DATAlists=(
"finalTest.txt"                   # the input: place the relevant .txt file with the datasets to run on inside the grid/ directory
)
...
prun --exec "./grid/grid_run.sh %IN $datamctag $SKIM" \
--rootVer=5.34/11 --cmtConfig=x86_64-slc5-gcc43-opt \
--inDS="${samples[$i]}${CONTAINER}" --nFiles 1 \
--outDS="user.${PUSER}.${outsample}" \
--outputs="AnalysisManager.data12_8TeV.ZZ.root" \
--nFilesPerJob 2 --mergeOutput \
--excludeFile=\*/obj,\*/src/\*Dict\*,\*/lib,RootCore/\*/StandAlone/\*,RootCore/\*/obj/\*,RootCore/\*/bin/\*,RootCore/RootCore/scripts/load_packages_C.so,RootCore/RootCore/lib/\*,RootCore/RootCore/include/\*,RootCore/RootCore/python/\*,AnalysisZmumu,AnalysisWmunu,AnalysisWW,AnalysisWZ,AnalysisHWW,AnalysisWZorHbb,AnalysisWjets,doc,AnalysisZZ/config/mc12_p1328\*,AnalysisZZ/config/data12_p1328\*,patches,PoD\* \
--extFile=RootCore/RootCore.par,RootCore/MuonEfficiencyCorrections/share/*,RootCore/MuonMomentumCorrections/share/*,\*.txt,ExtRootFiles,ExtRootFiles/\*,ExtRootFiles/pileup/\*.root,ExtRootFiles/muontrigger/\*.root,ExtRootFiles/electronSFcorrections/\*.root,RootCore/\*/\*/*.root \
...
For a successful grid job, make sure in the prun command that:
- the output of your analysis code has the same name as declared in --outputs
- the --excludeFile option does not contain any files that are needed by your job (but rule out all the files not relevant to your analysis, so that the submission tarball is lighter)
- the --extFile option contains all the files that are needed by your job (see also the related comments above).
4) Submit the job from ElectroweakBosons:
source ./scripts/setup_lxplus.sh
source ./grid/setup_prun.sh
./grid/prun.sh
NOTE: Use --nFiles=1 to run a test on a single file; remove it to run on the full sample.
How to install and run POWHEG-BOX/ZZ
Setup Athena:
asetup 17.2.8.11,AtlasProduction,here,64
Download trunk of POWHEG-BOX, optionally remove unwanted sub-packages to reduce the size of the BOX:
svn co --username anonymous --password anonymous svn://powhegbox.mib.infn.it/trunk/POWHEG-BOX
cd POWHEG-BOX
rm -rf Dijet hvq gg_* tt* ST_* HJ* VBF_* Zj* Z2jet Z_ew-BMNNPV W*
Change directory:
cd POWHEG-BOX/ZZ
Edit Makefile, find and setup the following variables:
LHAPDF_CONFIG=$(ATLAS_EXTERNAL)/../sw/lcg/external/MCGenerators_hepmc2.06.05/lhapdf/5.8.5/x86_64-slc5-gcc43-opt/bin/lhapdf-config
FASTJET_CONFIG=$(ATLAS_EXTERNAL)/../sw/lcg/external/fastjet/2.4.4/x86_64-slc5-gcc43-opt/bin/fastjet-config
Compile and run:
make pwhg_main
cp -r test test1
cd test1
../pwhg_main
The input options are given in powheg.input (an example can be found inside the test directory), and the output is a Les Houches Event file, "pwgevents.lhe". TIP: In powheg.input, set the PDF for the colliding protons to the set of your choice. For CT10 use:
lhans1 10800
lhans2 10800
Also, set the colliding beam energies to 4000 GeV (the default is 3500 GeV). The factorization and renormalization scales are also set in this file.
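Putting the tip together, the relevant powheg.input lines could look like this (parameter names follow the standard POWHEG-BOX conventions; treat the values as examples and cross-check against the powheg.input shipped in the test directory):

```
lhans1 10800    ! PDF set for hadron 1 (LHAPDF number for CT10)
lhans2 10800    ! PDF set for hadron 2 (LHAPDF number for CT10)
ebeam1 4000d0   ! energy of beam 1 in GeV
ebeam2 4000d0   ! energy of beam 2 in GeV
renscfact 1d0   ! renormalization scale factor (assumption: default)
facscfact 1d0   ! factorization scale factor (assumption: default)
```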
To shower events with Pythia in Athena, do the following. Prepare the Les Houches file for Athena scripts:
cp pwgevents.lhe user.inomidis.Powheg_CT10.126938.000001.pwgevents._1.events
tar -czvf user.inomidis.Powheg_CT10.126938.000001.pwgevents._1.tar.gz user.inomidis.Powheg_CT10.126938.000001.pwgevents._1.events
rm user.inomidis.Powheg_CT10.126938.000001.pwgevents._1.events
Setup this variable and get the job-options (this hack is needed for 17.2.X.Y, usual scripts don't work):
export JOBOPTSEARCHPATH=/cvmfs/atlas.cern.ch/repo/sw/Generators/MC12JobOptions/latest/common:$JOBOPTSEARCHPATH
cp /cvmfs/atlas.cern.ch/repo/sw/Generators/MC12JobOptions/latest/share/DSID126xxx/MC12.126938.PowhegPythia8_AU2CT10_ZZ_2e2mu_mll4_2pt5.py .
Run transformation script:
Generate_trf.py ecmEnergy=8000 runNumber=126938 firstEvent=1 maxEvents=-1 randomSeed=1234 jobConfig=MC12.126938.PowhegPythia8_AU2CT10_ZZ_2e2mu_mll4_2pt5.py outputEVNTFile=Powheg.pool.root inputGeneratorFile=user.inomidis.Powheg_CT10.126938.000001.pwgevents._1.tar.gz postExec='ServiceMgr.MessageSvc.enableSuppression=False'
The output is Powheg.pool.root, which contains the McEventCollection GEN_EVENT (the same as in GEN_AOD). Analyze it with Athena classes, or convert it to D3PD and analyze with ROOT.
How to install MCFM
MCFM is a program designed to calculate cross sections of various processes at hadron-hadron colliders. Its documentation can be found on the MCFM web page. MCFM depends on two external programs: CERNLIB and LHAPDF.
The following guide uses $HOME/.local for the installation of the necessary libraries and of the PDF sets. Add the following lines to your shell profile:
export PATH="$PATH:$HOME/.local/bin"
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$HOME/.local/lib"
Restart your shell and create the directory $HOME/.local.
How to install CERNLIB
CERNLIB is a (deprecated) collection of libraries and modules offered on CERN's central computers.
In order to install CERNLIB, change directory to $HOME/.local and retrieve the necessary files.
Extracting all three of them will create a directory named 2006b. After their extraction, the gzipped files can be safely deleted.
How to install LHAPDF
LHAPDF provides a unified and easy to use interface to modern PDF sets. It is designed to work not only with individual PDF sets but also with the more recent multiple "error" sets.
Its installation will take place in the same directory.
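The build commands themselves are not spelled out here; assuming the usual autotools layout of an LHAPDF 5.x tarball (the version number below is only an example), the installation would look something like:

```
tar -xzvf lhapdf-5.8.5.tar.gz
cd lhapdf-5.8.5
./configure --prefix=$HOME/.local
make
make install
```

With --prefix=$HOME/.local, the lhapdf-config and lhapdf-getdata tools end up in $HOME/.local/bin, which was added to PATH above.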
If the installation occurs without errors, LHAPDF is ready to be used. You can use the command lhapdf-getdata to download PDF sets. Either save them under a PDFsets folder in $HOME/.local and create a symbolic link to MCFM/Bin/PDFsets, or download them directly into MCFM/Bin/PDFsets:
mkdir $HOME/.local/PDFsets && cd !$
lhapdf-getdata CT10
Install MCFM
It is now time to install MCFM. First, download and extract the source.
Now run ./Install . Once it is finished, edit the makefile and set the CERNLIB and LHAPDFLIB variables as follows:
CERNLIB = /afs/cern.ch/user/FIRST_LETTER/USERNAME/.local/lib/cernlib-2006b/x86_64-slc5-gcc41-opt/lib
LHAPDFLIB = /afs/cern.ch/user/FIRST_LETTER/USERNAME/.local/lib
PDFROUTINES = LHAPDF
NTUPLES = YES
Save and exit the file, then compile:
make
EWUnfolding
This section describes how to get the EWUnfolding code from SVN, compile it, and use it. It has been tested on lxplus5, but not on lxplus6.
Check out code:
svn co svn+ssh://USER@svn.cern.ch/reps/atlasphys/Physics/StandardModel/ElectroWeak/Analyses/EWUnfolding/branches/EWUnfolding-00-00-01-branch/ EWUnfolding/
(branch 00-00-01 allows histogram input instead of branch input in the unfolding code, drastically reducing the SFrame output ROOT file size)
Compile it:
In the line below, gcc34 IS NOT A TYPO! All the symlinks under /afs/cern.ch/sw/lcg/external/gcc/ point to gcc34!
# gcc 4.3.5
source /afs/cern.ch/sw/lcg/external/gcc/4.3.5/x86_64-slc5-gcc34-opt/setup.sh
# ROOT 5.34.03
cd /afs/cern.ch/sw/lcg/app/releases/ROOT/5.34.03/x86_64-slc5-gcc43-opt/root/
source bin/thisroot.sh
cd ~
cd EWUnfolding/branches/EWUnfolding-00-00-01-branch/Code/
# Compile RooUnfold:
cd RooUnfold
source setupRooUnfold.sh
# Set up RootCore:
cd external/RootCore
./configure
# Compile BootstrapGenerator using RootCore:
cd ..
source RootCore/scripts/setup.sh
RootCore/scripts/find_packages.sh #do not source
RootCore/scripts/compile.sh #do not source
# Compile EWUnfold:
cd ..
make
# Set LD_LIBRARY_PATH to use BOOTSTRAP
source external/RootCore/scripts/setup.sh
(it would be handy to copy-paste all of the above into a script, e.g. "unfold.sh")
To run, simply issue:
./EWUnfoldBase config/unfolding_steering_file.xml
Documentation: in the usual ATLAS style, nearly none! There are two talks, one by Matthias Schott:
https://indico.cern.ch/getFile.py/access?contribId=5&resId=0&materialId=slides&confId=197093
and one by Adrian Lewis:
https://indico.cern.ch/getFile.py/access?contribId=9&sessionId=1&resId=0&materialId=slides&confId=210995
Tips'n'Tricks
Sane lxplus usage
We are supposed to submit heavy CPU/memory jobs to the lxbatch system and NOT run them on lxplus. However, sometimes it is virtually impossible to avoid running a job that brings an lxplus node to its knees. In order to avoid warnings from the IT department, use the
nice
command right before executing your command. For instance, for an SFrame job:
nice -n 10 sframe_main config/your_steering_file.xml
This will change your "niceness" from 0 (the default value) to 10. The niceness of a user varies from -20 (most favorable) up to 19 (least favorable) and works as an "advisor" to the scheduler running on lxplus. Only the root user can lower niceness.
PoD - How many workers?
More workers mean less running time but also more merging time, so a balance is needed: what you gain in running time you lose at merging. Personal experience: 60-80 workers are more than enough!
Analysis on lepton ntuples
See
https://twiki.cern.ch/twiki/bin/view/Main/AUThAtlasGroupPage/LeptonNtupleAnalysis
Setup ZZ->4l Analysis code
For SM ZZ analysis, see
https://twiki.cern.ch/twiki/bin/view/Main/AUThAtlasGroupPage/ZZ4lAnalysisCodeAtAUTh.
Contact: Vasiliki Kouskoura, Ioannis Nomidis.
For Higgs/inclusive four-leptons analysis, see
https://twiki.cern.ch/twiki/bin/view/Main/AUThAtlasGroupPage/FourLeptons4lAnalysisCodeAtAUTh.
Contact: Dinos Bachas, Ioannis Nomidis.
Matrix-element weights with MadWeight
Running the
MadWeight tutorial
Useful links