Using the Condor batch system from your physics desktop

The Edinburgh Physics department uses the Condor batch control system. Condor manages a large pool of machines and assigns users' computing jobs to machines that are idle. At the moment the pool consists of 27 single-processor Intel Linux machines located in the CP Lab. We have access to this facility, which, together with the local LHCb software installation, can be used to run analysis jobs in batch mode. A job is said to run in batch, rather than interactively, when you do not need to interact with it or stay logged in to the machine where it is executed.

You can ignore the example given below and use Ganga to submit your LHCb job instead. Make sure you run Ganga from the Condor frontend (ph-condor.ph.ed.ac.uk) or from a local node with Condor access (type gu.__location_backend() in Ganga to check), and set the backend to Condor(). See LHCbEdinburghGroupGanga.

As an example, I'm going to show the syntax of a Condor job description file and a simple shell script that executes a DaVinci job. The full details of the job description are in the online Condor manual; the parameters of most interest here are Universe, Executable and Queue. The remaining parameters are largely self-explanatory.

# begin Condor submit file

Universe = vanilla

SUBMIT_SKIP_FILECHECK = True
Transfer_executable = False
Transfer_files = ALWAYS
Should_transfer_files = YES
When_to_transfer_output = on_exit

Copy_to_spool = False

# actual job is performed by the shell script
script_to_run = lhcb_script.sh
Transfer_input_files = $(script_to_run)
Arguments = $(script_to_run)
Executable = /bin/bash
Output = $(script_to_run).$(cluster).$(process).out
Error = $(script_to_run).$(cluster).$(process).err

Log = $(script_to_run).log
Queue 1
# end Condor submit file

In Condor terminology, the Universe is the runtime environment. In the standard universe you have more control over the executable, much as you would when using a debugger such as gdb. However, this requires your executable to be linked against Condor, so it is of no use to us when running the LHCb software. We need to choose the vanilla universe instead, which is the closest to what other batch systems do. In this universe we can run bash scripts.
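
To illustrate how the macros in the job description file fit together, here is a minimal bash sketch that writes a similar file for an arbitrary script. The function name write_submit_file and its two arguments are made up for this example and are not part of Condor, and only a subset of the settings shown above is reproduced. Note that the heredoc escapes each $( ) so that Condor, rather than bash, expands the script_to_run, cluster and process macros.

```shell
#!/bin/bash
# Hypothetical helper (not part of Condor): write a minimal vanilla-universe
# job description file that runs a given bash script.
# Usage: write_submit_file <script> <submit-file>
write_submit_file() {
    local script="$1"
    local submit_file="$2"

    # \$( ) is escaped so bash leaves the Condor macros untouched;
    # ${script} is expanded by bash when the file is written.
    cat > "$submit_file" <<EOF
Universe = vanilla
Transfer_executable = False
Should_transfer_files = YES
When_to_transfer_output = on_exit
script_to_run = ${script}
Transfer_input_files = \$(script_to_run)
Arguments = \$(script_to_run)
Executable = /bin/bash
Output = \$(script_to_run).\$(cluster).\$(process).out
Error = \$(script_to_run).\$(cluster).\$(process).err
Log = \$(script_to_run).log
Queue 1
EOF
}
```

For example, write_submit_file lhcb_script.sh condor_lhcb.job would produce a file that can then be inspected by hand before it is submitted.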

Our executable is then the bash command with a script as argument. The script to run a DaVinci package looks like this:

#!/bin/bash

echo "Setting up LHCb software"

export HOME=/Home/aooliver
source /Disk/lochnagar0/lhcb/lhcb-soft/scripts/lhcb-condorsetup.sh

dvversion='v19r8'
source /Disk/lochnagar0/lhcb/lhcb-soft/scripts/setenvProject.sh Phys/DaVinci $dvversion
#setup environment for DV algorithm

MYALGO=/Home/aooliver/cmtuser/DaVinci'_'$dvversion/Phys/RunOnInclusive/v1r1
source $MYALGO/cmt/setup.sh
OPTS=$MYALGO/options/DVBs2JpsiPhi.opts

EXEC=DaVinci.exe
$EXEC $OPTS

echo "End."

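The LHCb software environment is configured by the lhcb-condorsetup.sh script (for bash). The rest of the script follows the standard procedure: set up the DaVinci project, set up the user algorithm through its cmt/setup.sh, and run DaVinci.exe with the options file.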
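
Because a batch job runs unattended, it can help to stop early when a setup or options file is missing instead of carrying on with a broken environment. The following is only a sketch of one way to do that; the helper name require_file is hypothetical, and the variable in the usage comment is the one from the script above.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the given file exists and is readable,
# otherwise print a message to stderr (which Condor writes to the Error file).
require_file() {
    local path="$1"
    if [ ! -r "$path" ]; then
        echo "Missing or unreadable file: $path" >&2
        return 1
    fi
}

# Intended use inside the job script, before sourcing or running anything:
#   require_file "$OPTS" || exit 1
```

Exiting with a non-zero status makes the failure visible in the job's error file rather than leaving an apparently finished job with no useful output.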

In order to submit a job to Condor, you will need to connect first to ph-condor.ph.ed.ac.uk and from your working directory use the condor_submit command:

% condor_submit condor_lhcb.job

This submits the job to the Condor queue; the Queue 1 statement in the job description file asks for a single instance of the job. You can check the status of your job by running condor_q. When the job is done, the files it produced are copied back to the working directory from which it was submitted, and Condor sends you an email.
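
Once the output has been copied back, a quick look at the error files shows which jobs wrote anything to stderr. The sketch below relies only on the Error naming pattern from the job description file above (script name, cluster, process). The function name list_nonempty_err is made up for this example, and a non-empty error file is only a hint, not proof, that a job failed.

```shell
#!/bin/bash
# Hypothetical helper: print the non-empty .err files produced for a script,
# following the pattern <script>.<cluster>.<process>.err.
# Usage: list_nonempty_err <script> [directory]
list_nonempty_err() {
    local script="$1"
    local dir="${2:-.}"
    local f
    for f in "$dir/$script".*.err; do
        # -s is true only for files that exist and are not empty; this also
        # skips the unexpanded pattern when nothing matches.
        if [ -s "$f" ]; then
            echo "$f"
        fi
    done
}
```

Running list_nonempty_err lhcb_script.sh in the submission directory would then print one file name per job that produced error output.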

Preventing core dumps

If your job crashes, Condor will create a core dump in the directory from which the job was submitted. This is great for debugging, but if 100 of your jobs all dump core, you will quickly run out of disk space. To prevent this, add this line to your Condor job description file:

CREATE_CORE_FILES=False
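
If core files have already piled up, they can be located before you delete them. This is a minimal sketch, assuming the core files are named core or core.<pid> (a common Linux convention that may differ on your machine); the function name list_core_files is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: list core dump files directly inside a directory,
# assuming they are named "core" or "core.<something>".
# Usage: list_core_files [directory]
list_core_files() {
    local dir="${1:-.}"
    find "$dir" -maxdepth 1 -type f \( -name 'core' -o -name 'core.*' \) | sort
}
```

Review the list first and remove the files by hand once you no longer need them for debugging.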

Topic revision: r4 - 2008-07-23 - unknown
 