Run CMSSW code using the Condor batch queue of the LPCCAF at FNAL

Introduction

You can use the LPCCAF at FNAL to parallelize the processing of samples available at FNAL. Jobs submitted to the LPCCAF Condor queue run the code from your CMSSW project directory and write their output into your execution directory.

  • Processing batch jobs on the LPCCAF consists of the following components
    • A user CMSSW project directory
    • A user execution directory where all the output will be stored
    • A tested parameter-set
    • A script, executed on each workernode (WN), which:
      • sets up the CMS software environment
      • sets up the user CMSSW project directory environment
      • replaces necessary entries in the parameter-set to make each job unique and stores the new parameter-set in the execution directory
      • executes cmsRun in the execution directory using the new parameter-set and stores stdout in a log file
    • A condor steering file (JDL) to submit jobs to the LPCCAF batch queues

Prerequisites

  • All LPCCAF workernodes (WN) have access to /uscms and /uscms_data/d1. Your CMSSW project directory and your execution directory must each be located in one of these two areas.
  • Make sure that your execution directory has enough free space. Check this using the quota command.
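
The location requirement above can be checked before submitting. The following is a minimal sketch, not part of the original tutorial: the function name is_wn_visible and the two example paths are hypothetical, and the check only compares path prefixes (it does not resolve symbolic links).

```shell
#!/bin/bash
# Sketch: check that a directory lies under one of the two areas that the
# text above says are accessible from the workernodes.
# The function name and the example paths are illustrative assumptions.
is_wn_visible() {
  # $1: absolute path of the directory to check
  case "$1" in
    /uscms/*|/uscms_data/d1/*) return 0 ;;
    *) return 1 ;;
  esac
}

for dir in "/uscms_data/d1/someuser/batch_tutorial" "/tmp/batch_tutorial"; do
  if is_wn_visible "$dir"; then
    echo "$dir: ok"
  else
    echo "$dir: not visible from the workernodes"
  fi
done
```

Running the same check on both your CMSSW project directory and your execution directory catches a misplaced directory before any job is queued.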

Prepare directories

  • Use the CMSSW project directory from this tutorial as your CMSSW project directory
  • Create your execution directory replacing <user> with your username:

mkdir /uscms_data/d1/<user>/batch_tutorial

Prepare parameter-set

  • Using the input dataset file dataset.cff from the DBS/DLS discovery, create a parameter-set in the src directory of your local user project area named

batch.cfg

and schedule the tutorial EDProducer in it. The following template CMSSW parameter-set is valid for <= CMSSW_1_4_X:

process P =
{

  #
  # load input file
  #
  source = PoolSource
  {
    untracked vstring fileNames = {"file:test.root"}
    untracked int32 maxEvents = CONDOR_MAXEVENTS
    untracked uint32 skipEvents = CONDOR_SKIPEVENTS
  }
  include "dataset.cff"

  # include MyTrackUtility producer
  module producer = MyTrackUtility
  {
    InputTag TrackProducerTag = ctfWithMaterialTracks
  }

  #
  # write results out to file
  #
  module Out = PoolOutputModule
  {
    untracked string fileName = 'CONDOR_OUTPUTFILENAME'
  }

  path p =
  {
    producer
  }

  endpath e =
  {
    Out
  }
}

The following template CMSSW parameter-set is valid for >= CMSSW_1_5_X:

process P =
{
  #
  # max events steering
  #
  untracked PSet maxEvents = 
  {
    untracked int32 input = CONDOR_MAXEVENTS
  }

  #
  # load input file
  #
  source = PoolSource
  {
    untracked vstring fileNames = {"file:test.root"}
    untracked uint32 skipEvents = CONDOR_SKIPEVENTS
  }
  include "dataset.cff"

  # include MyTrackUtility producer
  module producer = MyTrackUtility
  {
    InputTag TrackProducerTag = ctfWithMaterialTracks
  }

  #
  # write results out to file
  #
  module Out = PoolOutputModule
  {
    untracked string fileName = 'CONDOR_OUTPUTFILENAME'
  }

  path p =
  {
    producer
  }

  endpath e =
  {
    Out
  }
}

Prepare WN script

  • Prepare the WN script in the src directory of your local user project area. It has to be in the same directory as the JDL described in the next section. Name it

condor.sh

with

#!/bin/bash

#
# variables from arguments string in jdl
#
# format:
#
# 1: condor cluster number
# 2: condor process number
# 3: CMSSW_DIR
# 4: RUN_DIR
# 5: PARAMETER_SET (full path, has to contain all needed files in PoolSource and filled following variables with keywords: maxEvents = CONDOR_MAXEVENTS, skipEvents = CONDOR_SKIPEVENTS, output fileName = CONDOR_OUTPUTFILENAME)
# 6: NUM_EVENTS_PER_JOB
#

CONDOR_CLUSTER=$1
CONDOR_PROCESS=$2
CMSSW_DIR=$3
RUN_DIR=$4
PARAMETER_SET=$5
NUM_EVENTS_PER_JOB=$6

#
# header 
#

echo ""
echo "CMSSW on Condor"
echo ""

START_TIME=`/bin/date`
echo "started at $START_TIME"

echo ""
echo "parameter set:"
echo "CONDOR_CLUSTER: $CONDOR_CLUSTER"
echo "CONDOR_PROCESS: $CONDOR_PROCESS"
echo "CMSSW_DIR: $CMSSW_DIR"
echo "RUN_DIR: $RUN_DIR"
echo "PARAMETER_SET: $PARAMETER_SET"
echo "NUM_EVENTS_PER_JOB: $NUM_EVENTS_PER_JOB"

#
# setup software environment at FNAL for the given CMSSW release
#
source /uscmst1/prod/sw/cms/shrc uaf
export SCRAM_ARCH=slc4_ia32_gcc345
cd $CMSSW_DIR
eval `scramv1 runtime -sh`

#
# change to output directory
#
cd $RUN_DIR

#
# modify parameter-set
#

FINAL_PARAMETER_SET_NAME=`echo batch_${CONDOR_CLUSTER}_${CONDOR_PROCESS}`
FINAL_PARAMETER_SET=`echo $FINAL_PARAMETER_SET_NAME.cfg`
FINAL_LOG=`echo $FINAL_PARAMETER_SET_NAME.log`
FINAL_FILENAME=`echo $FINAL_PARAMETER_SET_NAME.root`
echo ""
echo "Writing final parameter-set: $FINAL_PARAMETER_SET to RUN_DIR: $RUN_DIR"
echo ""

let "skip = $CONDOR_PROCESS * NUM_EVENTS_PER_JOB"
cat $PARAMETER_SET | sed -e s/CONDOR_MAXEVENTS/$NUM_EVENTS_PER_JOB/ | sed -e s/CONDOR_SKIPEVENTS/$skip/ | sed -e s/CONDOR_OUTPUTFILENAME/$FINAL_FILENAME/ > $FINAL_PARAMETER_SET

#
# run cmssw
#

echo "run: cmsRun $FINAL_PARAMETER_SET >> $FINAL_LOG 2>&1"
cmsRun $FINAL_PARAMETER_SET >> $FINAL_LOG 2>&1
exitcode=$?

#
# end run
#

echo ""
END_TIME=`/bin/date`
echo "finished at $END_TIME"
exit $exitcode

Attention: this script is set up for SL4 releases (>= CMSSW_1_5_0). If you would like to use <= CMSSW_1_4_X, please change the line:

export SCRAM_ARCH=slc4_ia32_gcc345

to

export SCRAM_ARCH=slc3_ia32_gcc323

  • Make the script executable:

chmod 755 condor.sh
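
To see what the placeholder replacement in condor.sh does without running a job, the substitution step can be tried on a small stand-in template. This is only a sketch: the helper name substitute_placeholders, the three-line template, and the values 10, 30 and batch_1234_3.root are made up for illustration.

```shell
#!/bin/bash
# Sketch: apply the three keyword substitutions used in condor.sh to a small
# stand-in template. Helper name and all values are illustrative assumptions.
substitute_placeholders() {
  # $1: events per job, $2: events to skip, $3: output file name
  # Reads the template on stdin and writes the result to stdout.
  sed -e "s/CONDOR_MAXEVENTS/$1/" \
      -e "s/CONDOR_SKIPEVENTS/$2/" \
      -e "s/CONDOR_OUTPUTFILENAME/$3/"
}

# Three lines taken from the parameter-set template, enough to show the effect.
template=$(cat <<'EOF'
untracked int32 input = CONDOR_MAXEVENTS
untracked uint32 skipEvents = CONDOR_SKIPEVENTS
untracked string fileName = 'CONDOR_OUTPUTFILENAME'
EOF
)

result=$(printf '%s\n' "$template" | substitute_placeholders 10 30 batch_1234_3.root)
printf '%s\n' "$result"
```

Because "/" is the sed delimiter here, as in condor.sh, the output file name must not contain a slash. The WN script satisfies this by writing a bare file name into the execution directory.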

Prepare JDL

  • Prepare JDL in the local user project directory named

batch.jdl

  • Change directory names to your setup
  • Change how many events are processed per job by changing the last value in Arguments
  • Change how many jobs are submitted by changing the number after Queue

universe = vanilla
Executable = condor.sh
should_transfer_files = NO
Output = <execution directory>/batch_$(cluster)_$(process).stdout
Error  = <execution directory>/batch_$(cluster)_$(process).stderr
Log    = <execution directory>/batch_$(cluster)_$(process).condor
Requirements          = Memory >= 199 && OpSys == "LINUX" && (Arch != "DUMMY")
Arguments = $(cluster) $(process) <CMSSW project directory> <execution directory> <CMSSW project directory>/src/batch.cfg 10
Queue 10
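
The number after Queue and the last value in Arguments together decide which part of the input each job processes: condor.sh skips process * NUM_EVENTS_PER_JOB events, and Condor numbers the processes starting at 0. The sketch below prints the resulting split for the values used above. The function name event_range is hypothetical, and events are counted by their position in the input, starting at 1.

```shell
#!/bin/bash
# Sketch: show which events each job covers, mirroring the skip computation
# in condor.sh. The function name is an illustrative assumption.
NUM_JOBS=10             # number after "Queue" in batch.jdl
NUM_EVENTS_PER_JOB=10   # last value of "Arguments" in batch.jdl

event_range() {
  # $1: condor process number (0-based), $2: events per job
  local skip=$(( $1 * $2 ))
  echo "process $1: skip $skip, events $(( skip + 1 ))-$(( skip + $2 ))"
}

for (( process = 0; process < NUM_JOBS; process++ )); do
  event_range "$process" "$NUM_EVENTS_PER_JOB"
done
```

With these settings the ten jobs cover the first 100 events of the input. To process a larger sample, raise either value so that their product reaches the number of events available.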

Submission and status query

  • Submit the jobs from your local user project area:

condor_submit batch.jdl

  • Query the status of your jobs:

condor_q -submitter $USER

Check your output

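Finally, check the output of your jobs in your execution directory. For each job you should find the files batch_<cluster>_<process>.cfg, .log and .root written by condor.sh, together with the .stdout, .stderr and .condor files written by Condor.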
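
One quick way to spot failed jobs is to look for ROOT files that are missing or empty. The following is a sketch under the file naming used by condor.sh (batch_<cluster>_<process>.root). The function name missing_outputs and the cluster number in the usage comment are hypothetical, and a non-empty file is not proof that the job finished cleanly, so the .log and .stderr files are still worth reading.

```shell
#!/bin/bash
# Sketch: print the process numbers whose expected ROOT file is missing or
# empty. The function name is an illustrative assumption.
missing_outputs() {
  # $1: execution directory, $2: condor cluster number, $3: number of jobs
  local dir=$1 cluster=$2 njobs=$3 process
  for (( process = 0; process < njobs; process++ )); do
    # -s is true only if the file exists and has a size greater than zero
    [ -s "$dir/batch_${cluster}_${process}.root" ] || echo "$process"
  done
}

# Usage (hypothetical cluster number; replace <user> with your username):
# missing_outputs /uscms_data/d1/<user>/batch_tutorial 1234 10
```

An empty result means every job left a non-empty output file.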

Topic revision: r9 - 2007-06-26 - OliverGutsche
 