ROOT tips

How to check duplicate events in a TTree?

root[0] myTree->SetScanField(0);
root[1] myTree->Scan("eventNb:runNb:LS:zVtx:Reco_QQ_4mom.M()"); >tree.log
$ wc -l tree.log
$ sort -u tree.log | wc -l
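If the tree contains duplicated events, the two wc -l commands report different line counts. Note that Scan prints a leading Row column (the entry number), so strip that column from tree.log before sorting; otherwise every data line is unique.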
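The same check can be sketched in Python, which also makes it easy to drop the Row column. This is a minimal sketch, not part of the original recipe: the function name find_duplicate_rows and the parameter n_index_columns are hypothetical, and it assumes the usual Scan layout where data rows look like "* 0 * 101 * 1 *".

```python
from collections import Counter


def find_duplicate_rows(lines, n_index_columns=1):
    """Return {row_content: count} for data rows that occur more than once.

    Assumption: each data row looks like '*  0 *  101 *  1 *', and the first
    n_index_columns fields are counters (Row, plus Instance for array branches).
    """
    seen = Counter()
    for line in lines:
        fields = [f.strip() for f in line.strip().strip("*").split("*")]
        # Skip separator lines ('*****'), the header row and summary lines.
        if not fields or not fields[0].isdigit():
            continue
        # Ignore the counter columns so identical events compare equal.
        seen[tuple(fields[n_index_columns:])] += 1
    return {row: n for row, n in seen.items() if n > 1}
```

Usage: pass the lines of tree.log, e.g. find_duplicate_rows(open("tree.log")). An empty result means no duplicated events were found. Use n_index_columns=2 if the log also has an Instance column.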

git commands

  • Download files from GitHub servers:
$ git cms-init
$ git remote add <repository-name> <HTTPS clone URL>
   $ git remote add CMS-HIN-dilepton git@github.com:CMS-HIN-dilepton/cmssw.git
$ git remote show origin : List the branches of the remote "origin" that the local repo knows about. A remote branch can be checked out only after the local repo is aware of it.
$ git remote update
$ git fetch
   $ git fetch <repository-name> : Download the branches/tags of the repository
$ git branch : To check which branches are available. Chosen one is starred.
   $ git branch -d <local-branch> : Delete a local branch
$ git checkout <branch-name> : Change between branches
   $ git checkout HIN14015CodeSet/master
$ git checkout -b <localBranchName> <repo-name>/<branch-name>
   $ git checkout -b localBranchTest CMS-HIN-dilepton/onia_b20150311
$ git branch <localNewBranchName>
$ git push <remote-repo-name> <localNewBranchName> : Push the new local branch to the remote repo (the branch is created there if it does not exist)
   $ git push  <REMOTENAME> <LOCALBRANCHNAME>:<REMOTEBRANCHNAME>

$ git cms-addpkg FWCore/Version : Download a directory from the chosen branch

$ git clone git@github.com:<your-user-name>/usercode src/UserCode/<your-name>
$ git clone username@host:/remote/repository/location
$ git clone -b <branch-name> --single-branch <remote-repo-address> : clone 1 branch from remote repo
   $ git clone -b CMSSW_7_5_X_stdMuVal --single-branch  https://github.com/MiheeJo/cmssw.git

  • To connect a local directory to a remote repository and work with sub-directories only
    • This only registers and fetches the remote repository; it does not check all files out into local directories
$ git init <directory-name>
$ cd <directory-name>
$ git remote add -f <directory-name> <git clone URL (same as HTTPS address)>
   $ git remote add -f CMS-HIN-Dimuons git://github.com/CMS-HIN-dilepton/Dimuons.git

  • Delete wrong changes/checkouts/commits/branches
$ git checkout -- <directory> : Overwrite local files with the version recorded in the git index (normally the last commit); uncommitted local changes are discarded.
$ vi .git/info/sparse-checkout : Open this file and delete the directory you don't want to follow
$ git read-tree -mu HEAD : Running this command actually removes the directory that was deleted from the ".git/info/sparse-checkout" file.
$ git reset HEAD~1 : Undo the last commit (its changes stay in the working directory; add --hard to discard them too)
$ git branch -d [localbranchname] : delete local branch
$ git push [remotename] :[branch] : delete remote branch

  • Add/modify files
$ git add <file-name>
$ git commit -a
$ git commit -m "leave message"
$ git commit --amend   : Replace the last commit with a new one that also contains the currently staged changes (the message can be edited)

  • If git doesn't recognize changes on files for commit
$ git rm --cached <file-name> : Remove the file from the index only (the local file is kept)
$ git reset <file-name> : Restore the index entry of the file from the last commit

  • Upload changes to git server
$ git push <remote-name> <branch-name>
   $ git push origin master
   $ git push Macros HEAD:master    : Push the current HEAD to master of the remote (=Macros), e.g. when the remote has no branch yet
$ git push <remote-name> <local-branch-name>:<remote-branch-name>
$ git remote add origin <remote-server-address> : Needed the first time, to register the remote server address

  • Copy and sync what is in the server to local
$ git cms-merge-topic <pull request id>
$ git cms-merge-topic <github-user>:<branch-name>
   $ git cms-merge-topic -u CMS-HIN-dilepton:oniaHI_b20150311

  • Sync forked repo to its original repo link
$ git checkout <local-branch-name>
$ git pull https://github.com/ORIGINAL_OWNER/ORIGINAL_REPOSITORY.git <original-repo-branch-name>
$ git commit -m"updating forked repo"
$ git push <remote-repo-name> <local-branch-name>

  • Set up an SSH key for GitHub
$ ssh-keygen -t rsa -C "mihee.jo@cern.ch" (first prompt: press Enter to accept the default file; second prompt: enter a passphrase)
$ eval `ssh-agent -s` : Run this if "Could not open a connection to your authentication agent." is printed
$ ssh-add ~/.ssh/id_rsa
$ cat ~/.ssh/id_rsa.pub : Copy all of the output and paste it into the SSH key section of the GitHub account settings
$ ssh -vT git@github.com : Test the connection after the key has been added to the GitHub account
Move to CMSSW area
$ git cms-init

  • Compare 2 files in different branches (2 possible ways)
$ git diff branch1:file branch2:file
$ git diff branch1 branch2 -- myfile.cpp

Linux shell programs

xargs

  • ls | xargs -n 1 tar zxvf
  • cmsLs /store/user/miheejo/MC2011/NonPromptJpsi_template/ppReco/Tree/ | awk -v p="" '{if ($5!="") p=p" root://eoscms//eos/cms"$5}; END{print p}' | xargs hadd /tmp/miheejo/NonPromptJpsi_template_ppReco_cmssw445p1.root
    : Merge all listed files with a single hadd call (with -n 1, hadd would be run once per input file)

rsync

  • rsync -avr -e "ssh -l user" --exclude 'logs_*' ./bit1_weightedEff_* remote:/my/dir

diff

  • diff -ENwbur <dir1> <dir2> 
    : Diff all files in 2 directories
    • -E, -w, -b : ignore most whitespace changes; -N : treat files missing from one directory as empty (detects new files); -u : unified format; -r : recursive. Add -q to print only the names of files that differ.
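For use inside a script, the "-qrN" part of this comparison can be sketched in Python with the standard library. This is only an illustration under stated assumptions: differing_files is a hypothetical helper name, the comparison is byte-exact, and the whitespace-ignoring options (-E, -w, -b) are not reproduced.

```python
import filecmp
import os


def differing_files(dir1, dir2):
    """Return sorted relative paths that differ or exist in only one directory."""

    def relative_files(root):
        # Collect all regular files below root, as paths relative to root.
        found = set()
        for base, _dirs, files in os.walk(root):
            for name in files:
                found.add(os.path.relpath(os.path.join(base, name), root))
        return found

    left, right = relative_files(dir1), relative_files(dir2)
    # shallow=False forces a byte-by-byte comparison of the file contents.
    changed = {
        p for p in left & right
        if not filecmp.cmp(os.path.join(dir1, p), os.path.join(dir2, p), shallow=False)
    }
    # Files present on one side only count as differences, like diff -N.
    return sorted((left ^ right) | changed)
```

An empty list means the two directory trees have identical file contents.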

Find

  • find fracfree_fin/__mb_pt/ -type f -name "2D_GG.txt" | wc -l
  • find . -exec grep -l "searching text" {} \; : this will print the file name that contains "searching text"
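  • find . -name "*.bak" -type f -delete : Find files recursively and delete them. -delete must come at the end of the command; placed before the tests, it deletes everything.
  • find . -type f -print -exec chmod 644 {} \; : Find files recursively and change their permissions. (For directories use -type d with a mode such as 755, since directories need the execute bit to be entered.)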

grep

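  • grep -f file1 file2 : Print the lines of file2 that match any pattern listed in file1 (i.e. the common contents). Add -F to treat the patterns as fixed strings instead of regular expressions, and -w (whole word) or -x (whole line) for exact matches.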

AWK

  • Extract .py file name from LSFJOB file
    The "LSFJOB" file contains this string: cmsRun /afs/cern.ch/user/d/dmoon/scratch0/HI_TnP/cms442patch5/src/HiAnalysis/HiOnia/test/Batch/181910_182065/HisOnia1_11.py. In the command below, the first two awk stages extract "HisOnia1_11.py"; the last stage cuts the name at ".py", leaving "HisOnia1_11".
    awk -v p='cmsRun' '{if(p == $1) print $2}' < LSFJOB | awk 'BEGIN{FS="/"}; {print $NF}' | awk -v py='.py' '{print substr($1,0,index($0,py))}'
  • Get file names with regular expression.
    list=$(rfdir $indir | awk '{if (/pT[0-9.]+-[0-9.]+/) print $9}')
  • Rename files with simple regular expression. This changes old prefix of all files ("rp22_5000") to new one ("rp22").
    ls | awk -v p="" '{if (index($0,"rp22") != 0) {p=$0; gsub(/rp22_5000/,"rp22"); print "mv "p" "$0} }' > rename.sh
    chmod +x rename.sh
  • Read multiple files and combine corresponding records into 1 record
     awk '{a=$0; if (getline < "f1") {print(a $0);}}' < f2 
    • other example:
      awk '{ a=$1; b=$2; if (getline < "file2.txt") { printf("%.2f %.2f\n", a/$1, b/$2); } }' < "file1.txt"
  • Read 2 files, compare their contents, and print the lines of the 2nd file that are missing from the 1st
    • awk 'FNR==NR{a[$1" "$2]=$1" "$2; next} { if($1" "$2 in a){next}else{print $0} }' dsLxyzRe.txt dsLxyRe.txt  > compresult
    • Array a[..] has the same index and content: $1" "$2.
    • 'next' is like continue in C++: stop processing the current record and go to the next one.
    • FNR is the record number within the current file; NR is the running record number over all files
      • So FNR==NR{...} lets {...} be executed only for the 1st file
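The two-file comparison can also be sketched in Python, which may be easier to extend than the awk one-liner. This is a minimal sketch; the function name lines_missing_from_first and the parameter n_key_fields are hypothetical, and the key is the first two whitespace-separated fields, as in the awk command above.

```python
def lines_missing_from_first(first_lines, second_lines, n_key_fields=2):
    """Return the lines of second_lines whose key is absent from first_lines.

    The key is the tuple of the first n_key_fields whitespace-separated
    fields, playing the role of $1" "$2 in the awk version.
    """

    def key(line):
        return tuple(line.split()[:n_key_fields])

    # First pass (FNR==NR in awk): remember every key of the first file.
    known = {key(line) for line in first_lines}
    # Second pass: keep only the lines whose key was never seen.
    return [line for line in second_lines if key(line) not in known]
```

Usage: lines_missing_from_first(open("dsLxyzRe.txt"), open("dsLxyRe.txt")) returns the lines that the awk command would write to compresult.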

VI

  • Reuse matched pattern
\0 (or &) is the entire match; \1 is the first group captured with \( ... \)
  • Searching a pattern in a non-greedy way with regular expression, e.g. ^#\s\+\(.\{-}\):
 
    ^# to match the # character anchored at the start of the line
    \s\+ to match any whitespace one or more times
    \( to start a group
    .\{-} to match any character 0 or more times in a non-greedy way; this is different from .* in that it tries to match as little as possible, and not as much as possible.
    \) to end the group.
    : matches a literal :
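The difference between greedy and non-greedy matching can be tried out quickly in Python, where the Vim atom .\{-} is written .*? and groups need no backslashes. The sample line is made up for illustration.

```python
import re

# Hypothetical sample line with two colons.
line = "# key: value: more"

# Non-greedy: the group stops at the first colon.
non_greedy = re.match(r"^#\s+(.*?):", line)
# Greedy: the group extends to the last colon.
greedy = re.match(r"^#\s+(.*):", line)

print(non_greedy.group(1))  # key
print(greedy.group(1))      # key: value
```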

Convert

  • Merge multiple PDF files into one. Note that the resolution of the output is lower.
    convert input1.pdf input2.pdf ... output.pdf
  • Convert pdf to png :
    ls | grep pdf | awk 'BEGIN{FS=".pdf"}; {print "convert -gravity center -define pdf:use-cropbox=true ./"$0" ./"$1".png"}' > run.sh

pdfnup

  • Requires the pdfjam package. Merges several pdf files into 1 pdf file, arranged in a grid.
pdfnup --nup [column]x[row] --outfile [output file name] [input file 1] [input file 2] [input file 3] ... 

Kerberos and OpenAFS on (k)Ubuntu, Debian

These instructions work as-is on Ubuntu Karmic Koala (version 9.10) and Lucid Lynx (version 10.04). They have been tested on Ubuntu 12.04.2 LTS (Precise Pangolin) and work if you use apt-get instead of aptitude. This web page gives up-to-date information regarding CERN Kerberos. The original reference for these instructions can be found here, or check the OpenAFS wiki.

  • Install the necessary packages.
    sudo apt-get install openafs-krb5 openafs-client krb5-user module-assistant openafs-modules-dkms

NOTE: The openafs-modules-dkms package automatically does the compiling and installation of the openafs kernel module. More importantly, it keeps the kernel module up-to-date as software updates upgrade the kernel.

  • When asked, supply these answers to the following questions:
AFS cell this workstation belongs to: cern.ch
Size of AFS cache: 512000 (this means 0.5GB, can be higher if you have the space)
DB server host names for your home cell: (blank)
Run openafs client now and at boot? (user preference)

  • Unfortunately, at its default priority level, debconf won't ask all the necessary questions on the first pass. You'll need to reconfigure. Run:
sudo dpkg-reconfigure krb5-config openafs-client

  • When asked, supply these answers to the following questions:
Default realm: CERN.CH (all caps)
Does DNS contain pointers to your realm's Kerberos Servers? Yes
AFS cell this workstation belongs to: cern.ch
Size of AFS cache: 512000 (this means 0.5GB, can be higher if you have the space)
Run openafs client now and at boot? (user preference)
Look up AFS cells in DNS? Yes
Encrypt authenticated traffic with AFS fileserver? Yes
Dynamically generate the contents of /afs Yes
Use fakestat? Yes
DB server host names for your home cell: (blank)
Run openafs client now and at boot? (user preference)

  • Compile the openafs kernel module, then restart the service. (This step is not necessary on recent versions with the openafs-modules-dkms package.)
sudo m-a prepare
sudo m-a auto-install openafs
sudo modprobe openafs

  • Restart service.
sudo service openafs-client restart

  • Things should be working. Test the functionality with:
kinit (cern username) && klist
aklog && tokens

NOTE: You will have to repeat the kernel module installation whenever the kernel gets upgraded. If, after a reboot, AFS stops working, type the following:

  ls /lib/modules/`uname -r`/fs/openafs.ko
If you get No such file or directory, you probably need to rebuild the openafs module for the currently-running kernel.

Programming languages

python

  • if not [check for check in ("0.0-1.2","0.0-2.4","1.2-1.6","1.6-2.4") if str(rap) in check]:
      continue
    • in cannot be replaced by is here: in tests whether str(rap) is contained in the string, whereas is tests object identity.
  • pieces = [p for p in re.split("( |\t|\\\".*?\\\"|'.*?')", line) if p.strip()]
  • for thiscent in centrality:
      for thisrap in rapidity:
        open("fit_"+thiscent+"_"+thisrap,'w').close()
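The remark about in versus is in the first snippet can be checked with a few lines. This is a small sketch: matching_ranges and RANGES are hypothetical names wrapping the list comprehension from that snippet.

```python
RANGES = ("0.0-1.2", "0.0-2.4", "1.2-1.6", "1.6-2.4")


def matching_ranges(rap, ranges=RANGES):
    """Return the range strings that contain str(rap) as a substring."""
    return [r for r in ranges if str(rap) in r]


# A string built at run time: equal in value to RANGES[2], but a separate object.
built = "".join(["1.2", "-1.6"])

print(matching_ranges(built))  # substring test, finds the matching range
print(built == RANGES[2])      # value comparison
print(built is RANGES[2])      # identity comparison, usually False in CPython
```

Because is compares object identity rather than content, it can fail for two strings with the same characters, which is why the original condition has to use in (or ==).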
    

On lxplus (CAF)

  • cmsLs : list file names in the CAF directories
    cmsLs /store/caf/user/miheejo/HIExpressPhysics/hiexp-hirun2011-r181530-reco-v1_2/
  • cmsStage, cmsRm : copy files from a CAF directory to a local directory / remove them. Use the same directory location format as for cmsLs.
  • cmsMkdir
    /store/group/phys_heavyions/dileptons/MC2011/NonPromptJpsi_template/regit
  • Load logical files :
    root://eoscms//eos/cms/store/hidata/HIRun2011/HIMinBiasUPC/RAW/v1/000/182/296/F6A6C5C7-4615-E111-8E79-003048F11DE2.root
  • eos chmod 777 [eos directory] : change permission of eos directory
  • cmsPfn : will give the location of a root file in the eos area
  • cmsLs /store/user/miheejo/MCsample_pythiagun_445patch1/HiOnia2MuMuPAT | awk '{if ($5!="") print $5}' | xargs cmsRm
    : Remove all files in a directory
  • To check the quota: eos quota | grep -A 4 "Quota Node: /eos/cms/store/caf/user/" | head -5

CMSSW related

  • edmProvDump : check the contents of an EDM ROOT file: not only the process names but also their contents and the names of sub-branches
  • edmPluginRefresh
  • edmPluginDump : shows available plugins under the cmssw area
  • Get step 1,2,3,4 files in certain release:
runTheMatrix.py -l 140 -n -e

LXPLUS Batch jobs

  • Change the queue of submitted jobs
bjobs | grep PEND | awk '{print $1}' | xargs -n 1 bmod -q 1nw

CRAB jobs

Installation

   $ wget http://cmsdoc.cern.ch/cms/ccs/wm/www/Crab/Docs/CRAB_current.tgz
   $ ./configure

CRAB2 configuration

  • When "establishing gsissh controlpath" takes too much time: crab -cleanCache
  • Check integrated luminosity on the crab job
    • After retrieving all outputs, get the report.
       $ crab -c [crab job name] -report 
    • JSON files are created in res/. Run lumiCalc2.py on them.
       lumiCalc2.py -i [json file location] overview 
  • To submit CRAB jobs on CAF with multicrab
    • Main scripts are here
    • Needs multicrab.cfg (the expected file name) and crabCaf.cfg (loaded from multicrab.cfg; for conflicting items its contents are overridden by multicrab.cfg)
      "multicrab.cfg"
      [MULTICRAB]
      cfg=crabCaf.cfg
      
      [COMMON]
      #CMSSW.dbs_url = http://cmsdbsprod.cern.ch/cms_dbs_caf_analysis_01/servlet/DBSServlet
      CMSSW.datasetpath=/HIExpressPhysics/HIRun2011-Express-v1/FEVT
      CMSSW.total_number_of_lumis       = -1
      CMSSW.lumis_per_job                 = 10
      CMSSW.pset             = HiTrigAna_data_fromReco_cfg.py
      USER.user_remote_dir          = HIExpressPhysics
      
      [hiexp-hirun2011-r181695-reco-v1-collisionEvents_lowerSC_autohlt]
      CMSSW.runselection          = 181695
      
      "crabCaf.cfg"
      [CRAB]
      jobtype      = cmssw
      scheduler    = caf
      #server_name  = caf
      
      [CAF]
      queue = cmscaf1nd 
      
      [CMSSW]
      #dbs_url = http://cmsdbsprod.cern.ch/cms_dbs_caf_analysis_01/servlet/DBSServlet
      datasetpath=/HIExpressPhysics/ComissioningHI-Express-v1/FEVT
      pset = HiTrigAna_data_fromRaw_cfg.py
      total_number_of_lumis   = 2
      lumis_per_job     = 1
      output_file = openhlt_data.root
      
      [USER]
      #additional_input_files = BSEarlyCollision120909.db
      copy_data = 1
      storage_element=T2_CH_CAF 
            

  • Prompt Reco and RAW data transferred to EOS can be accessed directly by jobs running on LXBATCH queues. CRAB can send jobs to the local scheduler in the same way it sends jobs to the CAF; change your crab.cfg file as shown below. Then stage out to T2_CH_CERN or your CASTOR directory as you would normally.
    [CRAB]
    jobtype = cmssw
    scheduler = lsf
    [LSF]
    queue=<choose one of 1nd, 1nh, 1nw, 2nd, 2nw, 8nh, or 8nm depending on the type of job>
       
Topic revision: r54 - 2017-02-03 - MiheeJO