The test jobs

PF2PAT

Description: a standard PF2PAT configuration file with no output (drop *). The job is both CPU- and I/O-intensive.

Test configuration:

  • Events/Job: 5000
  • CMSSW_4_2_7
  • TFileAdaptor: default
  • sleep 60 before running the job, to record the background activity on the worker node (WN)

CPU

CSCS

TimeJob_Exe.png TimeJob_User.png TimeJob_Sys.png TimeJob_CpuPercentage.png

PSI

PSI_TimeJob_Exe.png PSI_TimeJob_User.png PSI_TimeJob_Sys.png PSI_TimeJob_CpuPercentage.png

NET

CSCS

WN_net_RX_kBs_12.png

PSI

On some WNs there is already network load before the job starts, as recorded during the initial sleep 60:

WN_net_RX_kBsWN_net_Seconds.png

OLD STUFF, DISCONTINUED

Setting up the test

Download the test jobs:

 cd src/
 mkdir Tests
 cd Tests/
 cvs co -d TestJobs UserCode/leo/Utilities/PerfToolkit/TestJobs/
 cd TestJobs/
 scram b

Testing

CRAB+CMSSW+SYS

Check that the following files are available in your test directory: jobscript_cmssw.sh and net.sh

Then, create a crab cfg like:

[CRAB]
jobtype = cmssw
scheduler = glite
[CMSSW]
datasetpath=/RelValProdTTbar/JobRobot-MC_3XY_V24_JobRobot-v1/GEN-SIM-DIGI-RECO
pset=JPE.py
total_number_of_events=-1
events_per_job = 50000
output_file =  cmssw_net.log, cmssw_vmstat.log, cmssw.xml, cmssw.stdout
[USER]
return_data = 1
ui_working_dir = Site.T2_CH_CSCS-Cfg.JPE-Dataset.RelValProdTTbarJobRobotMC_3XY_V24_JobRobotv1-EventsJob.50000-Sw.CMSSW_3_6_0_pre5-Date.201005041858
additional_input_files = net.sh
script_exe=jobscript_cmssw.sh
copy_data = 0
publish_data=0
publish_data_name = name_you_prefer
[GRID]
rb = CERN
se_black_list = T0,T1

To script the submission, use the crab.template file (modified according to the example above) together with crab_LaunchIOTestJobs.py, for example:

crab_LaunchIOTestJobs.py T3_CH_PSI CMSSW_3_7_0_pre4_Brian2nd JPE.py 50000 /RelValProdTTbar/JobRobot-MC_3XY_V24_JobRobot-v1/GEN-SIM-DIGI-RECO
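
The launcher arguments (site, software version, config, events per job, dataset) correspond to the fields encoded in ui_working_dir. The following is a minimal Python sketch of how such a name could be assembled; it is not the code of crab_LaunchIOTestJobs.py. The function names are hypothetical, and the rule for shortening the dataset path (keep the first two components, drop "/" and "-") is an assumption inferred from the single example in the crab cfg.

```python
def dataset_label(dataset_path):
    """Collapse a dataset path into a compact label.

    Assumption (inferred from one example only): keep the first two
    path components and remove every '/' and '-'.
    """
    parts = [p for p in dataset_path.split("/") if p]
    return "".join(parts[:2]).replace("-", "")


def build_workdir_name(site, cfg, dataset_path, events_per_job, sw, date):
    """Join Key.Value fields with '-' in the order used on this page.

    `date` is a string in YYYYmmddHHMM format, as produced by
    `date +%Y%m%d%H%M` in the shell scripts.
    """
    # Accept both "JPE" and "JPE.py" for the config name.
    cfg_name = cfg[:-3] if cfg.endswith(".py") else cfg
    fields = [
        ("Site", site),
        ("Cfg", cfg_name),
        ("Dataset", dataset_label(dataset_path)),
        ("EventsJob", events_per_job),
        ("Sw", sw),
        ("Date", date),
    ]
    return "-".join("{}.{}".format(key, value) for key, value in fields)
```

Called with the values from the crab cfg example, build_workdir_name reproduces the ui_working_dir string shown there.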
 

TODO: update the script on CVS

CMSSW+SYS

If you want to test single jobs (e.g. from a dedicated WN), CRAB is not an option. Use the jobscript_standalone_cmssw.sh script instead, e.g.:

./jobscript_standalone_cmssw.sh JPE

Before running, check the EVENTS and SW variables in the script:

$ cat jobscript_standalone_cmssw.sh

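#!/bin/bash

# Name of the CMSSW config, without the .py extension (e.g. JPE)
CFG=$1
# SW and EVENTS are only used here to label the output directory
SW=CMSSW_3_7_0_pre4_Brian2nd
LOG="cmssw"
EVENTS=50000

DIR=Site.T3_CH_PSI-Cfg.${CFG}-Dataset.RelValProdTTbarJobRobotMC_3XY_V24_JobRobotv1-EventsJob.${EVENTS}-Sw.${SW}-Date.$(date +%Y%m%d%H%M)-Label.SingleJob
mkdir "$DIR"

#eval `scram ru -sh`

# Start the system monitors in the background (vmstat samples every 10 s)
vmstat -nd 10 &> "${DIR}/${LOG}_vmstat_1.log" &
PIDSTAT=$!
./net.sh "${DIR}/${LOG}_net_1.log" &
PIDWATCH=$!
# Record 60 s of WN background activity before the job starts
sleep 60
( /usr/bin/time cmsRun -j "${DIR}/${LOG}_1.xml" "${CFG}.py" ) &> "${DIR}/${LOG}_1.stdout"
# Stop the monitors once the job has finished
kill -9 $PIDSTAT $PIDWATCH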

If you want to run more jobs, you can use something like this:

#!/bin/bash

CFG=$1
SW=CMSSW_3_8_0_pre1
LOG="cmssw"
START=1
END=10

DIR=Site.T3_CH_PSI-Cfg.${CFG}-Dataset.MinimumBiasCMSSW_3_8_0_pre1-GR_R_37X_V5_RelVal_col_10-v1-EventsJob.10000-Sw.${SW}-Date.`date +%Y%m%d%H%M`
mkdir $DIR

for i in `seq ${START} ${END}`; do
    vmstat -nd 10 &> ${DIR}/${LOG}_vmstat_${i}.log  &
    PIDSTAT=$!
    ./net.sh ${DIR}/${LOG}_net_${i}.log &
    PIDWATCH=$!
    sleep 60
    ( /usr/bin/time cmsRun -j ${DIR}/${LOG}_${i}.xml ${CFG}.py ) &> ${DIR}/${LOG}_${i}.stdout
    kill -9 $PIDSTAT $PIDWATCH
done
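
The pattern used by both scripts (start the monitors, wait to record the WN background, run the job, stop the monitors) can also be sketched in Python. This is only an illustration under stated assumptions: run_with_monitors is a hypothetical helper, not part of the PerfToolkit, and it stops the monitors with SIGTERM rather than the kill -9 used in the shell scripts. Unlike the shell scripts, it does not redirect the output of the job itself.

```python
import subprocess
import time


def run_with_monitors(job_cmd, monitors, warmup_seconds=60):
    """Run job_cmd while background monitor commands are logging.

    monitors is a list of (command, log_path) pairs; each command is an
    argument list whose stdout and stderr are written to log_path.
    Returns the exit code of the job.
    """
    procs = []
    logs = []
    try:
        for cmd, log_path in monitors:
            log = open(log_path, "w")
            logs.append(log)
            procs.append(
                subprocess.Popen(cmd, stdout=log, stderr=subprocess.STDOUT)
            )
        # Let the monitors record the background load before the job starts.
        time.sleep(warmup_seconds)
        return subprocess.call(job_cmd)
    finally:
        # Stop the monitors even if the job could not be started.
        for proc in procs:
            proc.terminate()
            proc.wait()
        for log in logs:
            log.close()
```

For the single-job case above, monitors would hold the vmstat and net.sh commands with their log files, and job_cmd would be the cmsRun command line.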

Analyzing the results

First of all, download the proper scripts:

cvs co -d PerfToolKit UserCode/leo/Utilities/PerfToolkit/cpt_getJobInfo.py
cvs co -d PerfToolKit UserCode/leo/Utilities/PerfToolkit/cpt_getStats.py
cvs co -d PerfToolKit UserCode/leo/Utilities/PerfToolkit/cpt_utilities.py
cvs co -d PerfToolKit/plugins UserCode/leo/Utilities/PerfToolkit/plugins

Example 1: CRAB+CMSSW+SYS

The first step is to create the rootfiles containing the needed information:

$ python PerfToolKit/cpt_getJobInfo.py --type=CMSSWCRAB Site.T2_CH_CSCS-Cfg.JPE-Dataset.RelValProdTTbarJobRobotMC_3XY_V24_JobRobotv1-EventsJob.50000-Sw.CMSSW_3_7_0-Date.201006061948

--type identifies the workflow you've used. It can take the values:

  • CRAB: a CMSSW job sent through CRAB
  • CMSSW: a CMSSW job executed stand-alone
  • CMSSWCRAB: a stand-alone CMSSW job executed (through a script) with CRAB
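
The directory name passed to cpt_getJobInfo.py encodes the test parameters as Key.Value fields joined by "-". Whether the script itself parses the name is not stated here; the following is a minimal sketch of how the fields could be recovered for bookkeeping. The function name and the list of keys are assumptions based on the directory names that appear on this page.

```python
import re

# Keys seen in the directory names on this page (assumed to be complete).
KNOWN_KEYS = ("Site", "Cfg", "Dataset", "EventsJob", "Sw", "Date", "Label")

# Split only on a '-' that is directly followed by "<Key>.", so that
# hyphens inside a value (e.g. in a dataset name) are preserved.
_FIELD_BOUNDARY = re.compile(r"-(?=(?:%s)\.)" % "|".join(KNOWN_KEYS))


def parse_workdir_name(name):
    """Return a dict mapping each key to its value (as a string)."""
    fields = {}
    for chunk in _FIELD_BOUNDARY.split(name):
        # Split at the first '.' only: values may contain dots.
        key, _, value = chunk.partition(".")
        fields[key] = value
    return fields
```

Applied to the directory in the command above, this gives Site = T2_CH_CSCS, Cfg = JPE, EventsJob = 50000 and Sw = CMSSW_3_7_0, among others.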

Then, we need to create the tables and the graphs. The script is cpt_getStats.py and takes the following arguments:

  • The list of rootfiles (separated by spaces) that contain the information. The * and ? wildcards are supported
  • --save-png: Saves created histos in png format
  • --save-root: Saves created histos in a ROOT file. If enabled, these histos will not be drawn on screen
  • --no-auto-bin: Disables automatic histo binning
  • --binwidth-time=BINWIDTHTIME: Bin width of time histos in seconds
  • --no-plots: Do not draw plots, only outputs the summary tables
  • --label=LABEL: Label to be used in naming plots, etc
  • --mode=MODE: Preconfigured modes for analysis: SiteMon, SiteMonExt, SiteCfrExt, Default (the default value). This drives which quantities are examined and the output style

For example, to monitor a site over time:

$ python PerfToolKit/cpt_getStats.py --mode=SiteMonExt *CSCS*.root

The behaviour of the different modes can be configured in the setCPTMode(mode) function defined in cpt_utilities.py.

Warning: some histograms may not be plotted when the values they contain are too small (e.g. User time ~ 10 s). In that case, try a finer bin width, e.g. --binwidth-time=5
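
To see why the bin width matters for small values, here is a small Python sketch of fixed-width binning. It does not reproduce the logic of cpt_getStats.py; bin_counts is a hypothetical helper and the sample values are made up for illustration.

```python
from collections import Counter


def bin_counts(values, binwidth):
    """Count values per fixed-width bin, keyed by the lower bin edge."""
    counts = Counter(int(v // binwidth) * binwidth for v in values)
    return dict(sorted(counts.items()))


# Made-up user times of about 10 s.
user_times = [8, 9.5, 10, 11, 12.5]

coarse = bin_counts(user_times, 60)  # every value lands in the same bin
fine = bin_counts(user_times, 5)     # the values spread over two bins
```

With a 60 s bin all five values share one bin, so the distribution has no visible shape; with a 5 s bin they are resolved into separate bins.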

-- LeonardoSala - 31-Aug-2010

Topic revision: r7 - 2011-11-10 - unknown
 