WH boosted challenge page
The page https://twiki.cern.ch/twiki/bin/viewauth/AtlasProtected/BoostedH2bb contains all the updated cut flows from us and UCL, and their efficiencies.
The older page is https://twiki.cern.ch/twiki/bin/view/AtlasProtected/BoostedH2bbFall2011.
Locally stored samples
The following is the list of locally stored samples produced with the latest version of the HSG5D3PD maker (v09132011).
| Storage folder | Sample | Skim |
| /Disk/speyside8/lhcb/atlas/ | ttbar (T1) McAtNlo | WH, ZHll |
| | ZHllbb (mH=110,115,120,125,130,140) pythia | ZHll, WH |
| | ZZllqq McAtNLO Jimmy | ZHll, WH |
| | ZHnunubb (mH=120) pythia | ZHmet |
| | ZZ2l2tau Mc@NLO Jimmy | ZHll |
| /Disk/speyside7/Grid/grid-files/Higgs/D3PDs/GroomedJetSamples/ | W+Jets AlpgenJimmy Npx (x=0,..,3) | WH |
| | WZ Herwig pt>100GeV | WH, ZHll |
| | WpZ_lnuqq McAtNlo_JIMMY | WH, ZHll |
| | WHlnubb (mH=120) Herwig (boosted) | WH |
| | WHlnubb (mH=120) Pythia (low stat) | WH |
| | WHlnubb (mH=120) Pythia (high stat) | WH |
The table will be updated as soon as new samples are transferred.
The file lists corresponding to the samples are stored in svn and can be found in the folder:
VHanalysis/samples/GroomedJetFileLists/
Samples available on UKI-SCOTGRID-ECDF_LOCALGROUPDISK
Production samples only.
MC:
dq2-list-dataset-site UKI-SCOTGRID-ECDF_LOCALGROUPDISK | grep p782
mc11_7TeV.107280.AlpgenJimmyWbbFullNp0_pt20.merge.NTUP_HSG5WH.e887_s1310_s1300_r2730_r2780_p782_tid573535_00
mc11_7TeV.107283.AlpgenJimmyWbbFullNp3_pt20.merge.NTUP_HSG5WH.e887_s1310_s1300_r2730_r2700_p782_tid573538_00
mc11_7TeV.107281.AlpgenJimmyWbbFullNp1_pt20.merge.NTUP_HSG5WH.e887_s1310_s1300_r2730_r2700_p782_tid573536_00
mc11_7TeV.107282.AlpgenJimmyWbbFullNp2_pt20.merge.NTUP_HSG5WH.e887_s1310_s1300_r2730_r2780_p782_tid573537_00
Data:
dq2-list-dataset-site UKI-SCOTGRID-ECDF_LOCALGROUPDISK | grep p766
PeriodK Egamma and Muon streams.
See below for how to generate file list from dataset name.
Samples available on the grid
Most of the samples stored locally (as well as the complete ttbar sample) can be found on the grid, and will soon be moved to LOCALGROUPDISK at ECDF. They correspond to the following container names:
user.chiarad.mc10_7TeV.116125.HerwigWZ_pt100.HSG5D3PD.v09132011.merge/
user.chiarad.mc10_7TeV.105940.McAtNlo_JIMMY_WpZ_lnuqq.HSG5D3PD.v09132011.merge/
user.chiarad.mc10_7TeV.109140.WH120lnbb_Herwig.HSG5D3PD.v09132011.merge/
user.chiarad.mc10_7TeV.109352.WH120lnubb_pythia.HSG5D3PD.v09132011.merge/
user.chiarad.mc10_7TeV.116591.WH120lnubb_pythia.HSG5D3PD.v09132011.merge/
user.chiarad.mc10_7TeV.107280.AlpgenJimmyWbbFullNp0_pt20.HSG5D3PD.v09132011.merge/
user.chiarad.mc10_7TeV.107281.AlpgenJimmyWbbFullNp1_pt20.HSG5D3PD.v09132011.merge/
user.chiarad.mc10_7TeV.107282.AlpgenJimmyWbbFullNp2_pt20.HSG5D3PD.v09132011.merge/
user.chiarad.mc10_7TeV.107283.AlpgenJimmyWbbFullNp3_pt20.HSG5D3PD.v09132011.merge/
user.chiarad.mc10_7TeV.105200.T1_McAtNlo_Jimmy.HSG5D3PD.v09132011.merge/
The list of all files (including data) centrally produced so far with the HSG5D3PD maker can be found by searching AMI for the string %HSG5WH%p717%.
Missing background MC samples:
Still need to be processed: WW, single top, W+c-jets, W+light jets
WH analysis result samples location
The samples produced running WHanalysis code are stored in:
/Disk/speyside7/Grid/grid-files/Higgs/OutputFiles
Both the .root and .txt files are stored: the first contains the histograms and output tree (useful for running TMVA), and the second is used to study cut efficiencies.
New VHanalysis code (using RootCore)
First, you need to have a recent version of ROOT set up; v5.26 to v5.32 is recommended. I used:
localSetupROOT --rootVersion=5.30.04-i686-slc5-gcc4.3
(It might be ok to use the default localSetupROOT... I didn't try it.)
You also need to set the environment variable CERN_USER if your CERN username is different from $USER:
> export CERN_USER=<your-cern-username>
Now you can check out a single package and run a script that will do everything else for you. The following will create a directory Edinburgh_H2bbAnalysis, check out the necessary packages, and compile everything:
> mkdir Edinburgh_H2bbAnalysis
> cd Edinburgh_H2bbAnalysis
> svn co svn+ssh://$CERN_USER@svn.cern.ch/reps/atlasoff/PhysicsAnalysis/HiggsPhys/HSG5/Hbb/Edinburgh_H2bbAnalysis/RootCore/trunk H2bbRootCore
> cd H2bbRootCore/share
> ./build-all_WH.sh
After this, if you want to rebuild, you'll need to go to the Edinburgh_H2bbAnalysis directory and do:
> source RootCore/scripts/setup.sh
To tag a new version of the code, first commit to the trunk:
svn up
svn status
svn ci -m "I did this and this"
Then copy the trunk to a new tag
svn cp svn+ssh://$CERN_USER@svn.cern.ch/reps/atlasoff/PhysicsAnalysis/HiggsPhys/HSG5/Hbb/Edinburgh_H2bbAnalysis/WHanalysis/trunk -r [REVNUMBER] svn+ssh://$CERN_USER@svn.cern.ch/reps/atlasoff/PhysicsAnalysis/HiggsPhys/HSG5/Hbb/Edinburgh_H2bbAnalysis/WHanalysis/tags/[TAGNAME]
where [REVNUMBER] was the number given after the commit and [TAGNAME] is of the form WHanalysis-XX-YY-ZZ
Please browse the trac to see previous versions, and also record what you did in the ChangeLog:
https://svnweb.cern.ch/trac/atlasoff/browser/PhysicsAnalysis/HiggsPhys/HSG5/Hbb/Edinburgh_H2bbAnalysis/WHanalysis/tags/
Updating packages after transition to a new Edinburgh_H2bbAnalysis/RootCore tag
(These instructions are adapted from the TopRootCore twiki.)
If you already have a working RootCore setup and want to transition to a more recent RootCore tag (i.e. a new list of recommended package versions), the following commands can be used to update all of the affected packages in one go. Local changes you may have made will not be overwritten (svn will merge simple conflicts and ask for your input in more complicated cases).
#run this from the Edinburgh_H2bbAnalysis directory
echo "switching packages to recommended tags"
for package in `cat H2bbRootCore/share/packages.txt | grep -v \# | grep tags`; do dir=`echo $package | sed -e 's;.*/tags/;;' -e 's;\(-..\)\{3,\};;'`; cmd="svn switch svn+ssh://svn.cern.ch/reps/$package $dir"; echo $cmd; eval $cmd; done
echo "updating packages that are using the trunk version"
for package in `cat H2bbRootCore/share/packages.txt | grep -v \# | grep trunk`; do dir=`echo $package | sed -e 's;.*/\([a-zA-Z0-9_]*\)/trunk;\1;'`; cmd="svn update $dir"; echo $cmd; eval $cmd; done
You will probably want to clean up and rebuild the packages after this:
RootCore/scripts/clean.sh
RootCore/scripts/build.sh
An example of how to run WHanalysis within the new framework
After following the instructions above, enter the WHanalysis directory. To set up the GoodRunsLists package:
source scripts/setup.sh
If this is the first time running the package, it is necessary to create the output directory:
mkdir files
Then, it is finally possible to run the analysis:
./bin/WHanalysis_x inputfilename outputfilename [opt1] [opt2]
inputfilename is taken from the samples directory, which is now held on speyside rather than in SVN:
/Disk/speyside7/Grid/grid-files/Higgs/D3PDs/samples/mc11_c_samples
The first time your jobs run, the validation part will crash if it does not find the reference histograms. To fix this, copy the output file to a new reference:
cp files/outputfilename_hists.root files/mc11_107280_ref_hists.root
Running WH analysis code on the grid with RootCore
Basic use:
The script SubmitToGrid.sh in the WHanalysis/scripts directory can be used to submit an analysis job to the grid. It uses prun with the --useRootCore flag, which results in all RootCore directories being copied to the directory from which you submit the grid job. For this reason, DO NOT use this script in one of the RootCore directories.
In the Edinburgh_H2bbAnalysis directory, do the following:
> mkdir gridsubmit
> cd gridsubmit
> ln -s ../WHanalysis/scripts/SubmitToGrid.sh .
> ln -s ../WHanalysis/scripts/RunJob.sh .
> ln -s ../WHanalysis/bin/WHanalysis_x .
> ln -s /path/to/cuts.txt .
> mkdir files
> mkdir samples
The syntax should be as follows:
./SubmitToGrid.sh <datasetname> <outputname> <gridusername>
For example:
./SubmitToGrid.sh data11_7TeV.00184169.physics_Egamma.merge.NTUP_HSG5WH.r2603_p659_p717/ 10102011_00184169_Egamma_6 obrien
The script takes three arguments:
1.) The dataset to run on, including the trailing slash. In principle you can also supply a local sample file list, and the script will find the relevant dataset and use the grid.
2.) The output filename stub.
3.) Your grid username.
How it works:
The above translates into the prun command:
prun --exec ./RunJob.sh %IN 10102011_00184169_Egamma_6 --useRootCore --inDS data11_7TeV.00184169.physics_Egamma.merge.NTUP_HSG5WH.r2603_p659_p717/ --outDS user.obrien.10102011_00184169_Egamma_6 --athenaTag=17.0.2 --outputs 10102011_00184169_Egamma_6.root,10102011_00184169_Egamma_6.txt
RunJob.sh creates a file list from the input dataset. (Note that if WHanalysis were able to take a comma-separated list of files as its first argument, there would be no need to call RunJob.sh.)
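The file-list step that RunJob.sh performs can be sketched as follows (this is an illustrative sketch, not the actual script; the default filenames are placeholders):

```shell
#!/bin/bash
# Illustrative sketch (not the actual RunJob.sh): prun substitutes %IN
# with a comma-separated list of input files, while WHanalysis_x reads
# a file list with one file per line, so convert one into the other.
INPUTS="${1:-fileA.root,fileB.root}"   # comma-separated list from %IN (placeholder default)
OUTSTUB="${2:-output}"                 # output filename stub

echo "$INPUTS" | tr ',' '\n' > input_files.txt

# The analysis itself would then be run as:
#   ./WHanalysis_x input_files.txt "$OUTSTUB"
cat input_files.txt
```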
In SubmitToGrid.sh you can also modify the prun arguments to exclude grid sites, or to add your own requirements, e.g. to limit the number of files in the dataset to run on.
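As a sketch, two commonly useful prun options could be appended to the command that SubmitToGrid.sh builds; the site name and file count below are placeholders, and --excludedSite / --nFiles are standard prun flags:

```shell
# Sketch of extending the prun command line in SubmitToGrid.sh.
# --excludedSite skips the listed grid sites; --nFiles runs over only
# the first N files of the input dataset. Values are placeholders.
EXCLUDED_SITES="ANALY_SOMESITE"   # hypothetical site to avoid
NFILES=10                         # limit for a quick test run

CMD="prun --exec \"./RunJob.sh %IN myoutput\" --useRootCore"
CMD="$CMD --excludedSite $EXCLUDED_SITES --nFiles $NFILES"
echo "$CMD"
```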
VHanalysis code
Do either a CMT setup on lxplus or localSetupROOT on a local machine.
For now, on local machines you have to set ROOT to v5.30 for the pileup reweighting to compile:
localSetupROOT --rootVersion=5.30.04-i686-slc5-gcc4.3
Then check out VHanalysis:
mkdir VHanalysis
cd VHanalysis
SVNROOT_EDINBURGH=svn+ssh://<username>@svn.cern.ch/reps/edinburgh/Atlas/Higgs/Boosted
svn co $SVNROOT_EDINBURGH/VHanalysis/VHtools/trunk VHtools
svn co $SVNROOT_EDINBURGH/VHanalysis/samples/trunk samples
Check out RootCore and configure:
svn co svn+ssh://<username>@svn.cern.ch/reps/atlasoff/PhysicsAnalysis/D3PDTools/RootCore/tags/RootCore-00-00-29 RootCore
cd RootCore
./configure
cd -
Check out the GoodRunsLists:
svn co svn+ssh://<username>@svn.cern.ch/reps/atlasoff/DataQuality/GoodRunsLists/trunk GoodRunsLists
Check out the PileupReweighting tool:
svn co svn+ssh://<username>@svn.cern.ch/reps/atlasoff/PhysicsAnalysis/AnalysisCommon/PileupReweighting/trunk PileupReweighting
Check out the MuonEfficiencyCorrections tool:
svn co svn+ssh://<username>@svn.cern.ch/reps/atlasoff/PhysicsAnalysis/MuonID/MuonIDAnalysis/MuonEfficiencyCorrections/tags/MuonEfficiencyCorrections-01-01-00 MuonEfficiencyCorrections
Then follow the instructions for one of the following packages:
WH analysis
Check out WHanalysis and compile all packages:
svn co $SVNROOT_EDINBURGH/VHanalysis/WHanalysis/trunk WHanalysis
cd WHanalysis
On a local machine, do localSetupROOT. On lxplus, do a CMT setup so you have ROOT.
Then typing make in the WHanalysis directory compiles the GoodRunsLists and VHtools packages:
source setup.sh
make
You can check out the VHanalysis/samples directory to get lists of the files copied to /Disk/speyside7:
svn co $SVNROOT_EDINBURGH/VHanalysis/samples/trunk samples
To run over a filelist, do:
./WHanalysis_x <filelist> <outputname>
Output histograms and the output tree will be in the files directory.
To test against a reference file, do:
./test_against_reference.sh
A set of .eps files will appear in the files directory, containing histograms for the current code plotted together with the reference histograms.
ZH -> llbb analysis
Follow the instructions for WHanalysis above, except check out the following package instead of WHanalysis:
svn co $SVNROOT_EDINBURGH/VHanalysis/ZHllbb_analysis/trunk ZHllbb_analysis
Submitting grid jobs for AOD->D3PD
These instructions are for lxplus. They use a tarball copied from /afs/cern.ch/user/r/roberth/public/gridjobs, containing a .tgz and a configuration file that are submitted directly to the grid using pathena. You don't need to unzip, untar, and compile the .tgz file to submit the job, but you can if you want to.
mkdir gridjobs
cd gridjobs
cp ~roberth/public/gridjobs/submit_all.tar.gz .
tar zxvf submit_all.tar.gz
rm -f submit_all.tar.gz
mv submit_all/setup .
source setup; # sets up 17.2.0.1.1,AtlasPhysics
voms-proxy-init --voms atlas
Enter your password.
The submission script takes two arguments: the name of the text file (in the data directory) containing the dataset definitions you intend to run over, and your VOMS nickname (as in user.XXXX, used in the output dataset definition).
The next commands submit the jobs, one job for each file in the sample list.
cd submit_all
./submit_all.sh <sample> <VOMS nickname>
Monitor the progress of your jobs here:
http://panda.cern.ch:25980/server/pandamon/query?ui=users
(Search for your name, probably all lower case but maybe not.)
If one or more of the jobs fail, you can resubmit the failed jobs if you know the JobID. The JobID is shown after your name in the panda monitor, or you can run pbook on lxplus (after the dq2 setup) to get the list of all the jobs you have run, with their JobIDs. To retry:
pbook
>> retry(<JobID>)
To merge the files after all the jobs are done, do the following:
./merge_all.sh <sample> <VOMS nickname>
Filtering D3PDs locally
Working with files on ECDF locally
Making a list of the files on ECDF
setupATLAS
localSetupDQ2Client
voms-proxy-init -voms atlas
dq2-ls -f -p -L UKI-SCOTGRID-ECDF_LOCALGROUPDISK mc10_7TeV.116591.WH120lnubb_pythia.merge.NTUP_HSG5WH.e701_s933_s946_r2302_r2300_p766/ | grep srm | sed 's/srm:\/\/srm.glite.ecdf.ed.ac.uk/rfio:\/\/\//g'
To copy the file from ECDF locally
localSetupGLite
export DPM_HOST=srm.glite.ecdf.ed.ac.uk
export DPNS_HOST=srm.glite.ecdf.ed.ac.uk
rfcp /dpm/ecdf.ed.ac.uk/home/atlas/atlaslocalgroupdisk/mc10_7TeV/NTUP_HSG5WH/e701_s933_s946_r2302_r2300_p766/mc10_7TeV.116591.WH120lnubb_pythia.merge.NTUP_HSG5WH.e701_s933_s946_r2302_r2300_p766_tid563989_00/NTUP_HSG5WH.563989._000001.root.1 /scratch/bob.root
To run on the file in ROOT locally.
localSetupGLite
voms-proxy-init -voms atlas
export DPM_HOST=srm.glite.ecdf.ed.ac.uk
export DPNS_HOST=srm.glite.ecdf.ed.ac.uk
export LD_LIBRARY_PATH=./:$LD_LIBRARY_PATH
ln -s /Disk/speyside4/atlas/ddm/libdpm.so.1.7.4 libshift.so.2.1
ln -s /Disk/speyside4/atlas/ddm/liblcgdm.so.1.7.4 ./liblcgdm.so
root
root [0] TFile *_file0 = TFile::Open("rfio:////dpm/ecdf.ed.ac.uk/home/atlas/atlaslocalgroupdisk/mc10_7TeV/NTUP_HSG5WH/e701_s933_s946_r2302_r2300_p766/mc10_7TeV.116591.WH120lnubb_pythia.merge.NTUP_HSG5WH.e701_s933_s946_r2302_r2300_p766_tid563989_00/NTUP_HSG5WH.563989._000001.root.1")
To filter the files.
Make a list of files as above (here called ListOfFilesRfio).
Change the path srm://srm.glite.ecdf.ed.ac.uk/ to rfio:///
sed -i 's/srm:\/\/srm.glite.ecdf.ed.ac.uk/rfio:\/\/\//g' samples/data11_7TeV.PeriodK.physics_Egamma.merge.NTUP_HSG5WH.r2713_p705_p766
Make a list of the branches we need, saving it to a file for use with the --var option below:
grep SetBranchAddress VHanalysis/VHtools/VHReader.cpp | grep -v "//" | cut -d "," -f 1 | cut -d "\"" -f2 > ListOfBranches
localSetupGLite
asetup 17.0.3,here
filter-and-merge-d3pd.py -i ListOfFilesRfio --out=/scratch/mc10_7TeV.116591.WH120lnubb_pythia.merge.NTUP_HSG5WH.e701_s933_s946_r2302_r2300_p766_Filtered.root -t physics --var=/phys/linux/wbhimji/VHanalysis/WHanalysis/ListOfBranches --keep-all-trees
Filtering Files using eddie / ecdf
This now uses prun pointing at ECDF.
RunListEgamma is a list of run numbers; for each of these a file list has already been made and is on eddie in the location specified below.
REMEMBER TO CHANGE THE OUTPUT DIRECTORY! Also remember to make the new directory (and make it world writable) BEFORE submitting the job.
for i in `cat RunListEgamma`; do prun --noBuild --site=ANALY_ECDF --outDS user.wbhimji.FilteroutEgamma-$i --exec "filter-and-merge-d3pd.py -i /exports/work/physics_ifp_gridpp_pool/Filter16Mar/samples/Egamma/fileList.$i.txt --out=/exports/work/physics_ifp_gridpp_pool/Filter16Mar/outputs/Egamma/Egamma.output.$i.root -t physics --var=ListOfBranches16Mar --keep-all-trees" --athenaTag=17.0.4; done
I am copying the files to somewhere that the ppe machines can see them, using:
[wbhimji@frontend04 ~]$ ./sge_copy_to_NAS.sh /exports/work/physics_ifp_gridpp_pool/Filter7Jun/outputs /exports/nas/exports/cse/ph/ifpnp/PPE/atlas/users/Higgs/D3PD/Filter7Jun/outputs
where the copy-to-NAS script is essentially just
rsync -av $1/* $2
but with some flags added to run on the batch system (taken from the ECDF twiki).
Instructions for running on the ecdf batch system:
The filtering commands are wrapped in jobSubFilt.sh.
Make a directory called Filter and a sub-directory in it called samples.
Split the list of files
split -l 30 -d samples/Listoffile samples/Listoffile_Split
where 30 is the number of files per split file.
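The loop bound used in the qsub command below (17 in the example) should be the number of split files minus one; as a sketch, it can be obtained by counting the split files:

```shell
# Count the split files produced by the split command above to get the
# loop range for submission (assumes the Listoffile_Split* naming).
NSPLIT=$(ls samples/Listoffile_Split* 2>/dev/null | wc -l)
echo "number of split files: $NSPLIT (loop k from 0 to $((NSPLIT - 1)))"
```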
Then submit however many files to the batch queue with
for k in `seq -w 0 17`; do qsub -N data11.PeriodK.Egamma.Filt.$k jobSubFilt.sh Listoffile_Split$k; done
Check your jobs with qstat. The outputs will go in
/exports/work/physics_ifp_gridpp_pool/FilteredD3PDs/Listoffile_Split$k.root
from where you can scp them back to Speyside.
Useful Web Links:
- Jet cleaning variables: good, bad & ugly jets definitions.
Meetings / Mailing Lists:
ATLAS UK Higgs
atlas-uk-higgs@cern.ch
Old Stuff
Using the Good Run List (GRL) in Edinburgh
Provided ROOT etc. are set up correctly. (The GRL only seems to work with ROOT v5.26.)
svn co svn+ssh://<username>@svn.cern.ch/reps/atlasoff/DataQuality/GoodRunsLists/tags/GoodRunsLists-00-00-84
cd GoodRunsLists-00-00-84/cmt
make -f Makefile.Standalone
cd ../../
ln -s GoodRunsLists-00-00-84 GoodRunsLists
export GRL_LIB_DIR=${PWD}/GoodRunsLists/StandAlone
export LD_LIBRARY_PATH=${GRL_LIB_DIR}:${LD_LIBRARY_PATH}
For subsequent uses, you will need to add the following lines to your .bashrc file:
export GRL_LIB_DIR=<path-to-your-myAnalysis-directory>/GoodRunsLists/StandAlone
export LD_LIBRARY_PATH=${GRL_LIB_DIR}:${LD_LIBRARY_PATH}
Extra GRL info, for independent users.
You can download the GRL xml files from the links on the HSG5DPD page below. However, VJM has added some of the GRL files to the code in the tgz file. Please check that the GRL files are the up-to-date ones!
GRL code at CERN:
svn co svn+ssh://<username>@svn.cern.ch/reps/atlasoff/DataQuality/GoodRunsLists/tags/GoodRunsLists-00-00-84
cd GoodRunsLists-00-00-84/cmt
make -f Makefile.Standalone
You'll then need to edit your Makefile and LD_LIBRARY_PATH to pick up the GRL libraries; see the example above. (This seems to only work with ROOT v5.26.)
Release-16 code in SVN
Based on the above, code to make the cuts for the WinterMiniChallange is now in the Edinburgh SVN:
svn co svn+ssh://<username>@svn.cern.ch/reps/edinburgh/Atlas/Higgs/d3PDCutter/
Then you will also need to check out the GoodRunsLists code (I advise doing so in a separate directory, to avoid committing it into this repo) and run make:
ln -s Wherever/GoodRunsLists GoodRunsLists
source scripts/SetupAnal.sh
./bin/D3PDCutter scripts/HSG5MiniChallangeMCFiles
The argument is a text file with the list of root files to run over. It will take a long time, so you can either split that file up (using, for example, the command split) or you can do:
./bin/D3PDCutter scripts/HSG5MiniChallangeMCFiles 10 10
where the second optional argument is the starting event and the third is the number of events to run over.
Working with the Code for Boosted Higgs
setupATLAS
asetup 16.0.2,here
cmt co -r JetMomentTools-00-00-18 Reconstruction/Jet/JetMomentTools
svn co svn+ssh://username@svn.cern.ch/reps/atlasgrp/Physics/Exotic/Common/BoostedObjects/JetSubstructureD3PDMaker/tags/JetSubstructureD3PDMaker-00-00-02
Then edit QcdD3PD_AOD_MC_Filt.py in JetSubstructureD3PDMaker to point at some MC locally.
There is Hbb signal MC AOD at
/Disk/speyside7/Grid/wbhimji/mc09_7TeV.109352.WH120lnubb_pythia.recon.AOD.e614_s765_s767_r1302_tid173802_00/
cd Reconstruction/Jet/JetMomentTools/cmt
gmake
athena.py QcdD3PD_AOD_MC_Filt.py
-- VictoriaMartin - 19-Oct-2012