Information on the CERN Tier 3 for the ATLAS Physics group AtlasCERNCat

You can find detailed information about our Tier 3 under

Further links

This tutorial is only meant to give you a rough feeling for how to use our batch system with and without Ganga. Much more detailed Ganga tutorials exist, which you should also consult; here are some links

Working with our group space and xrootd servers

The setup is done via

export STAGE_HOST=castoratlast3
export STAGE_SVCCLASS=atlascernuserdisk

Note that export STAGE_SVCCLASS=atlaslocalgroupdisk no longer seems to be relevant. After this setup you can simply use the usual rfcp, rfdir, ... commands. The files can be accessed via

root://castoratlast3//castor/cern.ch/user/m/mschott/xrootdtest/dummyFile

Two things have to be noted. First, you have to use ROOT version >= 5.18e (for example, the one that comes with Athena release 14.5.0). Second, you need to create a POOL file catalogue in order to use xrootd files. This can be done via

pool_insertFileToCatalog root://castoratlast3//castor/cern.ch/user/m/mbaak/xrootd/AOD.023533._00001.pool.root.2
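
The xrootd addresses shown above are simply the absolute Castor path prefixed with root:// and the server name. A minimal sketch of that mapping follows; the helper name xrootd_url is hypothetical and not part of any CERN tool, and it only reproduces the pattern seen in the two examples above.

```python
def xrootd_url(castor_path, host="castoratlast3"):
    """Return the xrootd URL for an absolute Castor path.

    Hypothetical helper: it only reproduces the pattern
    root://<host>/<absolute path> used in the examples above.
    """
    if not castor_path.startswith("/castor/"):
        raise ValueError("expected an absolute /castor/... path")
    # The absolute path keeps its leading slash, which gives the
    # double slash after the host name.
    return "root://%s/%s" % (host, castor_path)


if __name__ == "__main__":
    print(xrootd_url("/castor/cern.ch/user/m/mschott/xrootdtest/dummyFile"))
```

The resulting string can be passed to pool_insertFileToCatalog or used as an input file name.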

Working with the batch system

Basic Operations

To log on to one of our interactive nodes, start from lxplus and type

bsub -Is -q atlasinter zsh
kinit

An overview of all your jobs in the batch queues can be obtained via

bjobs

and we can kill a job via

bkill JOBID
where the parameter JOBID is the actual ID of the job (see the bjobs command).

Every user should have their own directory on our Castor disk space. Before testing it, you may have to set up the Castor environment variables

export STAGE_SVCCLASS=atlt3

Now, we can create a directory on your CAT FS via

rfmkdir /castor/cern.ch/grid/atlas/atlt3/scratch/userName/Tutorial

and copy some input files there

rfcp /afs/cern.ch/atlas/groups/PAT/Tutorial/AtlasOffline-14.2.10/AODs/valid1.005200.T1_McAtNlo_Jimmy.recon.AOD.e322_s435_r432_tid022496/AOD.022496._00001.pool.root.1 /castor/cern.ch/grid/atlas/atlt3/scratch/mschott/Tutorial/T1_McAtNlo_Jimmy.recon.AOD.1.root
and check the result via
rfdir /castor/cern.ch/grid/atlas/atlt3/scratch/mschott/Tutorial

Submitting batch jobs

I assume that we are working with Athena release 14.2.10 and that it has already been set up via

source ~/cmthome/setup.sh -tag=14.2.10,32
or similar commands for your account. In this tutorial we work with the UserAnalysis package, which we therefore check out into our Athena working directory via
cmt co -r UserAnalysis-00-13-04 PhysicsAnalysis/AnalysisCommon/UserAnalysis
go to
cd PhysicsAnalysis/AnalysisCommon/UserAnalysis/cmt
and type
cmt config
source setup.sh
gmake
The gmake step is probably not necessary, so you may skip it. We now change to
cd PhysicsAnalysis/AnalysisCommon/UserAnalysis/share
and edit the following lines in the main job option file AnalysisSkeleton_topOptions.py
# ServiceMgr.EventSelector.InputCollections = [ "AOD.pool.root" ]
ServiceMgr.EventSelector.InputCollections = ["/afs/cern.ch/atlas/groups/PAT/Tutorial/AtlasOffline-14.2.10/AODs/valid1.005200.T1_McAtNlo_Jimmy.recon.AOD.e322_s435_r432_tid022496/AOD.022496._00001.pool.root.1" ]
where we just give an example input file. Now we test the whole setup so far, via
athena.py AnalysisSkeleton_topOptions.py
root AnalysisSkeleton.aan.root

Now we want to run exactly the same job on our batch system. Go to the run directory (cd ../run) and create the file batch.sh

export RFIO_USE_CASTOR_V2=YES
export STAGE_HOST=castoratlas
export STAGE_SVCCLASS=atlt3
export CURRENT_DIR=`pwd`
cp /afs/cern.ch/user/u/userid/Athena/14.2.10/PhysicsAnalysis/AnalysisCommon/UserAnalysis/share/AnalysisSkeleton_topOptions.py .
cp /afs/cern.ch/atlas/groups/PAT/Tutorial/AtlasOffline-14.2.10/AODs/valid1.005200.T1_McAtNlo_Jimmy.recon.AOD.e322_s435_r432_tid022496/AOD.022496._00001.pool.root.1 .
source ~/cmthome/setup.sh -tag=14.2.10,32
cd /afs/cern.ch/user/u/userid/Athena/14.2.10/PhysicsAnalysis/AnalysisCommon/UserAnalysis/cmt
source setup.sh
cd $CURRENT_DIR
athena.py AnalysisSkeleton_topOptions.py
rfcp AnalysisSkeleton.aan.root /castor/cern.ch/grid/atlas/atlt3/scratch/USER/Tutorial/AnalysisSkeleton.aan.root

The script copies the job option file and the input file to the local batch job directory. Then it sets up Athena and runs the Athena job. Afterwards it copies the output file back to Castor. Obviously you have to adapt the above script to your directory structure. Before running, we change the input and output file locations in AnalysisSkeleton_topOptions.py to

ServiceMgr.EventSelector.InputCollections = ["./AOD.022496._00001.pool.root.1" ]
ServiceMgr.THistSvc.Output = [ "AANT DATAFILE='./AnalysisSkeleton.aan.root' OPT='RECREATE'" ]
AANTupleStream.OutputName = './AnalysisSkeleton.aan.root'
to test our castor storage.
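
Because batch.sh has to be adapted to each user's directory structure, the user-specific parts can be filled in from a template. The following is a minimal sketch, not part of the original setup: the function name make_batch_script and the placeholder names athena_dir, input_file and castor_dir are hypothetical, while the fixed lines are copied from the batch.sh shown above.

```python
from string import Template

# Template of batch.sh. ${athena_dir}, ${input_file} and ${castor_dir}
# are hypothetical placeholders for the user-specific paths; "$$" is
# an escaped "$" so that $CURRENT_DIR survives the substitution.
BATCH_TEMPLATE = Template("""\
export RFIO_USE_CASTOR_V2=YES
export STAGE_HOST=castoratlas
export STAGE_SVCCLASS=atlt3
export CURRENT_DIR=`pwd`
cp ${athena_dir}/PhysicsAnalysis/AnalysisCommon/UserAnalysis/share/AnalysisSkeleton_topOptions.py .
cp ${input_file} .
source ~/cmthome/setup.sh -tag=14.2.10,32
cd ${athena_dir}/PhysicsAnalysis/AnalysisCommon/UserAnalysis/cmt
source setup.sh
cd $$CURRENT_DIR
athena.py AnalysisSkeleton_topOptions.py
rfcp AnalysisSkeleton.aan.root ${castor_dir}/AnalysisSkeleton.aan.root
""")


def make_batch_script(athena_dir, input_file, castor_dir):
    """Return the text of batch.sh with the user-specific paths filled in."""
    return BATCH_TEMPLATE.substitute(
        athena_dir=athena_dir,
        input_file=input_file,
        castor_dir=castor_dir,
    )


if __name__ == "__main__":
    # Example values only; replace them with your own directories.
    print(make_batch_script(
        "/afs/cern.ch/user/u/userid/Athena/14.2.10",
        "/afs/cern.ch/atlas/groups/PAT/Tutorial/AtlasOffline-14.2.10/AODs/valid1.005200.T1_McAtNlo_Jimmy.recon.AOD.e322_s435_r432_tid022496/AOD.022496._00001.pool.root.1",
        "/castor/cern.ch/grid/atlas/atlt3/scratch/USER/Tutorial",
    ))
```

Redirecting the printed text into batch.sh gives a script equivalent to the hand-written one.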

Finally, we can submit our job to our short batch queue, via

bsub -q atlascatshort source batch.sh
Jobs that will take more than 1 hour should be sent to our long queue via
bsub -q atlascatlong source batch.sh
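
The choice between the two queues can be encoded in a small helper. This is only a sketch: the function name bsub_command and the runtime estimate in minutes are assumptions made for illustration, and the one-hour threshold is taken from the rule above. The helper only builds the command line and does not submit anything.

```python
SHORT_QUEUE = "atlascatshort"
LONG_QUEUE = "atlascatlong"


def bsub_command(expected_minutes, script="batch.sh"):
    """Build the bsub command line for a job of the given expected runtime.

    Jobs expected to run longer than one hour go to the long queue,
    everything else goes to the short queue.
    """
    queue = LONG_QUEUE if expected_minutes > 60 else SHORT_QUEUE
    return ["bsub", "-q", queue, "source", script]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be checked first.
    print(" ".join(bsub_command(15)))
    print(" ".join(bsub_command(180)))
```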

When the job has finished, the output can be found at

rfdir /castor/cern.ch/grid/atlas/atlt3/scratch/USER/Tutorial

Since we already have an Athena environment set up and are in the run directory, we now just type

get_files -jo HelloWorldOptions.py
as we will need the HelloWorldOptions.py for our Ganga test.

Setup Ganga

Start a clean session on lxplus again (currently Ganga does not work on our atlt3 interactive nodes, therefore we have to use lxplus) and source the relevant setup script
source /afs/cern.ch/sw/ganga/install/etc/setup-atlas.sh
If you start ganga for the first time, type
ganga -g
This creates a file named .gangarc. Ganga produces quite a lot of files while it runs. The directory in which these files are stored can be changed by editing the .gangarc file and adding the following line
gangadir = /afs/cern.ch/user/LETTER/NAME/PLACE/gangadir

Now we can start Ganga for the first time by simply typing

ganga

Inside Ganga, typing

jobs
will give you an overview of your jobs. Jobs can be killed and removed via
jobs(JOBID).remove() 
To quit again, just press CTRL-D.

Simple Ganga Jobs

Now go again to the run directory
cd /afs/cern.ch/user/u/userid/Athena/14.2.10/PhysicsAnalysis/AnalysisCommon/UserAnalysis/run
and create myscript.sh
#!/bin/sh
echo 'myscript.sh running...'
echo "----------------------"
/bin/hostname
echo "HELLO!"
echo "----------------------"
and gangaScript.py. Do not forget to adapt the following to your directory structure
j = Job()
j.application=Executable()
j.application.exe=File('/afs/cern.ch/user/u/userid/Athena/14.2.10/PhysicsAnalysis/AnalysisCommon/UserAnalysis/run/myscript.sh')
j.backend=LCG()
j.submit() 
The important point here is that we have chosen the LCG grid as backend, i.e. the script will be executed on the grid. Now start ganga again and submit the job to the LCG grid
execfile("./gangaScript.py")
The status can be checked with
jobs
Once the job has finished, you can see its output under
$HOME/gangadir/workspace/mschott/LocalAMGA/0
if 0 was the job ID. This was our first grid-job submitted via ganga!

Using Ganga on our batch system

Now we set up Athena 14.2.10 again and go to the UserAnalysis package, which we have checked out before

source ~/cmthome/setup.sh -tag=14.2.10
cd /afs/cern.ch/user/u/userid/Athena/14.2.10/PhysicsAnalysis/AnalysisCommon/UserAnalysis/cmt
cmt config
source setup.sh
Currently there is a problem in Athena 14.x.y which introduces some Python conflicts. As a workaround, you have to remove the entry /afs/cern.ch/sw/lcg/external/Python/2.5/slc4_ia32_gcc34/lib/python2.5 from the $PYTHONPATH variable. To do so, first print it via
echo $PYTHONPATH
Then take the above entry out and set the variable again to the remaining value
export PYTHONPATH=...
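
Editing the variable by hand is error-prone, so the removal can also be scripted. The following is a minimal sketch, assuming $PYTHONPATH is a colon-separated list; the function name strip_path_entry is hypothetical. Since a child process cannot change the environment of the calling shell, the script only prints the export line to be pasted or evaluated.

```python
import os

# Entry that has to be removed, as named in the workaround above.
CONFLICTING = "/afs/cern.ch/sw/lcg/external/Python/2.5/slc4_ia32_gcc34/lib/python2.5"


def strip_path_entry(path_value, entry=CONFLICTING):
    """Remove every occurrence of entry from a colon-separated path list.

    Trailing slashes are ignored when comparing, and empty entries
    (for example from a doubled colon) are dropped as well.
    """
    target = entry.rstrip("/")
    kept = [p for p in path_value.split(":")
            if p and p.rstrip("/") != target]
    return ":".join(kept)


if __name__ == "__main__":
    cleaned = strip_path_entry(os.environ.get("PYTHONPATH", ""))
    # Print the shell command; it has to be executed in the calling shell.
    print("export PYTHONPATH=%s" % cleaned)
```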
We change to the run directory and create a file named gangaHello.py
config["Athena"]["ATLAS_SOFTWARE"] = "/afs/cern.ch/atlas/software/releases"
j = Job()
j.application=Athena()
j.application.exclude_from_user_area=["*.o","*.root*","*.exe"]
j.application.prepare(athena_compile=True)
j.application.option_file='$HOME/Athena/14.2.10/PhysicsAnalysis/AnalysisCommon/UserAnalysis/run/HelloWorldOptions.py'
j.application.max_events='10'
j.backend=Local()
j.submit() 
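This Ganga job means the following
  • config["Athena"]["ATLAS_SOFTWARE"] sets the location of the ATLAS software releases
  • j = Job() defines the job
  • j.application=Athena() sets it as an Athena job
  • j.application.exclude_from_user_area lists the file patterns that are left out of the tarball of the user area
  • j.application.prepare(athena_compile=True) makes a tarball of the local packages which are to be sent to the Grid for execution, and tells ganga that these packages should be compiled at the host
  • j.application.option_file points the job to the job options
  • j.application.max_events sets the number of events
  • j.backend=Local() tells the job to run on the local machine
  • j.submit() submits the job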
We now run the script locally by starting ganga and typing
execfile("./gangaHello.py")
To run this job on our batch system, we only have to change the following in the previously defined Ganga Job.
#j.backend=Local()
j.backend=LSF(queue='atlascatshort')

Athena Analysis within Ganga

Now we want to run our own analysis code with Ganga. To simulate that, we just edit two lines (near line 260) of PhysicsAnalysis/AnalysisCommon/UserAnalysis/src/AnalysisSkeleton.cxx to

  m_h_elecpt     = new TH1F("elec_pt","pt el",50,0,250.*GeV);
  sc = m_thistSvc->regHist("/AANT/GangaDemo/elec_pt",m_h_elecpt);
If the directory GangaDemo appears in our output ROOT file, we can be sure that our version has been used. Before starting, we test whether everything compiles
cd PhysicsAnalysis/AnalysisCommon/UserAnalysis/cmt
cmt config
source setup.sh
gmake

Again, we change to the run directory and create a file named gangaAnalysis.py

config["Athena"]["ATLAS_SOFTWARE"] = "/afs/cern.ch/atlas/software/releases"
j = Job()
j.name='AnalysisExample'
j.application=Athena()
j.application.exclude_from_user_area=["*.o","*.root*","*.exe"]
j.application.prepare(athena_compile=True)
j.application.option_file='$HOME/Athena/14.2.10/PhysicsAnalysis/AnalysisCommon/UserAnalysis/share/AnalysisSkeleton_topOptions.py'
j.inputdata=ATLASLocalDataset()
j.inputdata.names=['/afs/cern.ch/atlas/groups/PAT/Tutorial/AtlasOffline-14.2.10/AODs/valid1.005200.T1_McAtNlo_Jimmy.recon.AOD.e322_s435_r432_tid022496/AOD.022496._00001.pool.root.1']
j.application.max_events='-1'
#j.splitter=AthenaSplitterJob()
#j.splitter.numsubjobs=2
#j.merger=AthenaOutputMerger()
j.outputdata=ATLASOutputDataset()
j.outputdata.outputdata=['AnalysisSkeleton.aan.root']
j.outputdata.location = '$HOME/Athena/Ganga_Output'
j.backend=LSF(queue='atlascatshort')
j.submit()
and run it in ganga via
execfile("./gangaAnalysis.py")

Ganga can also handle job splitting and job merging. For this, you just have to uncomment the following lines in the above job definition

#j.splitter=AthenaSplitterJob()
#j.splitter.numsubjobs=2
#j.merger=AthenaOutputMerger()
Obviously you have to use a larger input sample for this to be effective.
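
To see why a larger input sample is needed, consider how a list of input files could be divided among subjobs. The sketch below is purely illustrative and is not Ganga's actual splitting algorithm; the function name split_inputs is hypothetical. With a single input file, as in the job above, only one subjob can receive data no matter how many are requested.

```python
def split_inputs(files, numsubjobs):
    """Distribute input files round-robin over at most numsubjobs subjobs.

    Illustration only, not the algorithm used by AthenaSplitterJob.
    Never returns empty subjobs: with fewer files than requested
    subjobs, fewer subjobs are created.
    """
    if numsubjobs < 1:
        raise ValueError("numsubjobs must be at least 1")
    n = min(numsubjobs, len(files))
    return [files[i::n] for i in range(n)]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    sample = ["AOD._00001.pool.root", "AOD._00002.pool.root",
              "AOD._00003.pool.root"]
    for index, chunk in enumerate(split_inputs(sample, 2)):
        print(index, chunk)
```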

Further comments

Right now it is not (easily) possible to use files on Castor as input collection. This should be solved within the coming weeks.

PROOF on our batch system

t.b.d.

Topic revision: r7 - 2009-01-09 - MatthiasSchott
 