Main Web>TWikiUsers>DanielVanDerSter>GangFuPanda (2010-01-08, WolfgangWalkowiak)

pathena usage

usage: pathena [options] <jobOption1.py> [<jobOption2.py> [...]]

'pathena --help' prints a summary of the options

options:

Option	Description	Ganga can do this right now?
-h, --help	show this help message and exit	yes
--split=SPLIT	Number of sub-jobs to which a job is split	yes
--nFilesPerJob=NFILESPERJOB	Number of files on which each sub-job runs	yes
--nEventsPerJob=NEVENTSPERJOB	Number of events on which each sub-job runs	?
--site=SITE	Site name where jobs are sent (default:ANALY_BNL_ATLAS_1)	yes
--inDS=INDS	Name of an input dataset	yes
--minDS=MINDS	Dataset name for minimum bias stream	?
--nMin=NMIN	Number of minimum bias files per one signal file	?
--cavDS=CAVDS	Dataset name for cavern stream	?
--nCav=NCAV	Number of cavern files per one signal file	?
--libDS=LIBDS	Name of a library dataset	sortof
--beamHaloADS=BEAMHALOADS	Dataset name for beam halo A-side	?
--beamHaloCDS=BEAMHALOCDS	Dataset name for beam halo C-side	?
--nBeamHaloA=NBEAMHALOA	Number of beam halo files for A-side per sub job	?
--nBeamHaloC=NBEAMHALOC	Number of beam halo files for C-side per sub job	?
--beamGasHDS=BEAMGASHDS	Dataset name for beam gas Hydrogen	?
--beamGasCDS=BEAMGASCDS	Dataset name for beam gas Carbon	?
--beamGasODS=BEAMGASODS	Dataset name for beam gas Oxygen	?
--nBeamGasH=NBEAMGASH	Number of beam gas files for Hydrogen per sub job	?
--nBeamGasC=NBEAMGASC	Number of beam gas files for Carbon per sub job	?
--nBeamGasO=NBEAMGASO	Number of beam gas files for Oxygen per sub job	?
--outDS=OUTDS	Name of an output dataset. OUTDS will contain all output files	yes
--destSE=DESTSE	Destination storage element. All outputs go to DESTSE (default :%BNL_ATLAS_2)	yes
--nFiles=NFILES, --nfiles=NFILES	Use a limited number of files in the input dataset	yes
--nSkipFiles=NSKIPFILES	Skip N files in the input dataset	no
-v	Verbose	trivial
-l, --long	Send job to a long queue	trivial
--blong	Send build job to a long queue	trivial
--cloud=CLOUD	cloud where jobs are submitted (default:US)	trivial
--noBuild	Skip buildJob	yes
--individualOutDS	Create individual output dataset for each data-type. By default, all output files are added to one output dataset	no
--noRandom	Enter random seeds manually	trivial
--memory=MEMORY	Required memory size	trivial
--official	Produce official dataset	no
--extFile=EXTFILE	pathena exports files with some special extensions (.C, .dat, .py, .xml) in the current directory. If you want to add other files, specify their names, e.g., data1.root,data2.doc	yes
--extOutFile=EXTOUTFILE	define extra output files, e.g., output1.txt,output2.dat	yes
--supStream=SUPSTREAM	suppress some output streams. e.g., ESD,TAG	?
--noSubmit	Don't submit jobs	yes
--tmpDir=TMPDIR	Temporary directory in which an archive file is created	trivial
--shipInput	Ship input files to remote WNs	?
--noLock	Don't create a lock for local database access	n/a
--fileList=FILELIST	List of files in the input dataset to be run	?
--dbRelease=DBRELEASE	DBRelease or CDRelease (DatasetName:FileName). e.g., ddo.000001.Atlas.Ideal.DBRelease.v050101:DBRelease-5.1.1.tar.gz	?
--addPoolFC=ADDPOOLFC	file names to be inserted into PoolFileCatalog.xml except input files. e.g., MyCalib1.root,MyGeom2.root	?
--skipScan	Skip LRC/LFC lookup at job submission	n/a
--inputFileList=INPUTFILELIST	name of file which contains a list of files to be run in the input dataset	no
--removeFileList=REMOVEFILELIST	name of file which contains a list of files to be removed from the input dataset	no
--corCheck	Enable a checker to skip corrupted files	n/a
--prestage	EXPERIMENTAL : Enable prestager. Make sure that you are authorized	no
--useAMGA	use AMGA for location lookup	n/a
--voms	use VOMS extensions	?
--ara	use Athena ROOT Access	yes
--araOutFile=ARAOUTFILE	define output files for ARA, e.g., output1.root,output2.root	yes
--trf=TRF	run transformation, e.g. --trf "csc_atlfast_trf.py %IN %OUT.AOD.root %OUT.ntuple.root -1 0"	? AtlasAthenaMC?
--notSkipMissing	If input files are not read from SE, they will be skipped by default. This option disables the functionality	n/a
--burstSubmit=BURSTSUBMIT	Please don't use this option. Only for site validation by experts	GangaRobot
--inputType=INPUTTYPE	File type in input dataset which contains multiple file types	?
--mcData=MCDATA	Create a symlink with linkName to .dat which is contained in input file	?
--pfnList=PFNLIST	Name of file which contains a list of input PFNs. Those files can be un-registered in DDM	?
-c COMMAND	One-liner, runs before any jobOs	no
-p BOOTSTRAP	location of bootstrap file	?
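
As an illustration of how the options above combine on the command line, the following minimal Python sketch assembles a pathena argument list from a dictionary. It uses only flags listed in the table. The helper name build_pathena_command, the dataset names and the jobOptions file name are made up for illustration, and the sketch only builds the command; it does not run pathena.

```python
def build_pathena_command(job_options, options):
    """Assemble a pathena argument list (hypothetical helper, not part of pathena).

    job_options: list of jobOption file names.
    options: dict mapping option names from the table above (without the
             leading dashes) to values; True means a bare flag.
    """
    argv = ["pathena"]
    for name, value in options.items():
        if value is True:
            argv.append("--%s" % name)              # bare flag, e.g. --noBuild
        elif value is False or value is None:
            continue                                # option not requested
        else:
            argv.append("--%s=%s" % (name, value))  # valued option, e.g. --split=10
    argv.extend(job_options)                        # jobOptions follow the options, as in the usage line
    return argv


# Placeholder dataset and file names, for illustration only.
cmd = build_pathena_command(
    ["MyAnalysis_jobOptions.py"],
    {
        "inDS": "some.input.dataset",
        "outDS": "user.SomeUser.test.output",
        "nFilesPerJob": 5,
        "noBuild": True,
        "noSubmit": False,
    },
)
print(" ".join(cmd))
```

Keeping the options in a dictionary makes it easy to compare a pathena invocation with the corresponding Ganga job attributes option by option.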

pathena flow

import libs
import Client
- get SiteSpecs
set defaults
parse options
sanity check options:
- require outDS, split>=0, nEventsPerJob not used when nFilesPerJob
- set site=AUTO if cloud specified
- start preparing file list: filelist, inputFileList, removeFileList
- check grid proxy
  - get DN
- verify outDS matches DN and format
- check if outDS is unique (and check shadowDS also)
parse cmt environment: cmt show projects
- gets athenaVer, groupArea, cacheVer, nightVer
get run directory
get job options file
use runBrokerage to choose site (if site==AUTO and inDS not specified)
correct site (add ANALY_) and destSE (destSE=site if destSE=='')
extract run configuration
- uses ConfigExtractor to determine output files
- verify that there are output files.
archive sources, InstallArea
- send to panda
get outDS location if it already exists
handle input files:
- get list of files: fileList
  - if inDS: runBrokerage against list of sites holding the inDS
  - if pfnlist: add the pfns
  - if shipFiles: fileList=shipFiles
  - do splitting of input
- handle other types: cavernList, minbiasList, beamHaloAList, beamHaloCList, beamGasHList, beamGasCList, beamGasOList
get DB datasets
index the output files
submit the job
- build job and subjobs if splitting
- attach input and output file descriptions
- create outDS and libDS
- submit the job to panda
log job in local record db for pathena_util
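
The option checks and the site correction in the flow above can be sketched in a few lines of Python. This is a simplified illustration, not pathena's own code: the function names are hypothetical, options are held in a plain dict, an unset option is assumed to be None, and the nEventsPerJob/nFilesPerJob rule is interpreted as "the two must not be set together". The proxy, DN and outDS uniqueness checks and the brokerage step are left out.

```python
def check_options(opts):
    """Apply the basic sanity checks from the flow to a dict of options."""
    opts = dict(opts)  # work on a copy so the caller's dict is untouched

    # require outDS
    if not opts.get("outDS"):
        raise ValueError("outDS is required")
    # split >= 0
    if opts.get("split", 0) < 0:
        raise ValueError("split must be >= 0")
    # nEventsPerJob not used when nFilesPerJob is given (assumed interpretation)
    if opts.get("nEventsPerJob") is not None and opts.get("nFilesPerJob") is not None:
        raise ValueError("nEventsPerJob cannot be combined with nFilesPerJob")
    # set site=AUTO if cloud specified
    if opts.get("cloud"):
        opts["site"] = "AUTO"
    return opts


def correct_site(opts):
    """Add the ANALY_ prefix to the site and default destSE to the site.

    Assumption: this runs after a concrete site has been chosen, so
    site == "AUTO" is left alone rather than prefixed.
    """
    opts = dict(opts)
    site = opts["site"]
    if site != "AUTO" and not site.startswith("ANALY_"):
        site = "ANALY_" + site
    opts["site"] = site
    # destSE = site if destSE == ''
    if opts.get("destSE", "") == "":
        opts["destSE"] = site
    return opts


# Placeholder values, for illustration only.
checked = check_options({"outDS": "user.SomeUser.test.output",
                         "site": "BNL_ATLAS_1", "split": 2})
final = correct_site(checked)
print(final["site"], final["destSE"])
```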

Panda Client functions

Client.py - provides functions:

    def _x509():
        def __init__(self):
        def get(self,url,data):
        def post(self,url,data):
        def put(self,url,data):
        def convRet(self,ret):
    def getSiteSpecs():
    def getLRC(site):
    def getLFC(site):
    def getSE(site):
    def submitJobs(jobs,verbose=False):
    def getJobStatus(ids):
    def killJobs(ids):
    def reassignJobs(ids):
    def queryPandaIDs(ids):
    def queryLastFilesInDataset(datasets,verbose=False):
    def putFile(file,verbose=False):
    def deleteFile(file):
    def queryFilesInDataset(name,verbose=False,v_vuids=None):
    def getDatasets(name,verbose=False):
    def addDataset(name,verbose=False):
    def getElementsFromContainer(name,verbose=False):
    def convSrmV2ID(tmpSite):
    def getLocations(name,fileList,cloud,woFileCheck,verbose=False,useAMGA=False):
    def eraseDataset(name,gridSrc,verbose=False):
    def nEvents(name, verbose=False, askServer=True, fileList = {}, scanDir = '.'):
    def _getPFNsLRC(lfns,dq2url,verbose):
    def getMissLFNsFromLRC(files,url,verbose=False):
    def _getPFNsLFC(fileMap,site,explicitSE,verbose=False):
    def getMissLFNsFromLFC(fileMap,site,explicitSE,verbose=False):
    def _getGridSrc():
    def getDN(origString):
        def __init__(self):
        def getMap(self):
        def handle_data(self, data):
        def handle_starttag(self, tag, attrs):
        def handle_endtag(self, tag):
    def getJobStatusFromMon(id,verbose=False):
    def runBrokerage(sites,atlasRelease,cmtConfig=None,verbose=False):
    def isExcudedSite(tmpID):
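
The list above gives only the function signatures. As a rough sketch of how a caller such as Ganga might wrap them, the helper below assumes that a Client function returns a (status, output) pair with status 0 meaning success. That return convention is an assumption and is not stated on this page, so it should be checked against the Client.py version in use. fake_getJobStatus is a stand-in with the same signature as getJobStatus(ids), so the example runs without the Panda client installed.

```python
def call_checked(func, *args, **kwargs):
    """Call a Client-style function and return its output, raising on failure.

    Assumption: func returns (status, output) and status == 0 means success.
    """
    status, output = func(*args, **kwargs)
    if status != 0:
        raise RuntimeError("%s failed with status %s: %r"
                           % (func.__name__, status, output))
    return output


def fake_getJobStatus(ids):
    """Stand-in for getJobStatus(ids); returns one placeholder entry per id."""
    return 0, ["status-for-%s" % i for i in ids]


result = call_checked(fake_getJobStatus, [101, 102])
print(result)
```

With the real module the call would take the form call_checked(Client.getJobStatus, ids), provided the assumed return convention holds.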

-- DanielVanDerSter - 30 Jul 2008

Topic revision: r2 - 2010-01-08 - WolfgangWalkowiak
