Dynamic Data Management
A few notes on DDM for our codefest...
CRAB2/3 popularity packet structure:
Brian and Marco provided the following example of a dashboard popularity report:
{
'Basename':'/store/data/Run2012D/DoubleMuParked/AOD/22Jan2013-v1',
'inputFiles':'/30002/7EBCF273-2D84-E211-8075-485B39800B83.root::1::EDM::Local::4;/30002/62388A55-3184-E211-AC40-90E6BA19A245.root::1::EDM::Local::6;/30001/4C92379E-1B83-E211-9491-20CF305B0524.root::1::EDM::Local::2;/30002/1202CAF9-9684-E211-A104-20CF3027A5E5.root::1::EDM::Local::7;/30001/58DF66C9-E983-E211-83D8-00259073E4C2.root::1::EDM::Local::3;/30001/AE02E471-0F83-E211-AC3B-20CF3027A5ED.root::1::EDM::Local::1;/30002/64A9E8D1-2E84-E211-BAE0-20CF3027A630.root::1::EDM::Local::5;/30002/DA1F8660-7084-E211-913F-90E6BA442EF2.root::1::EDM::Local::8',
'BasenameParent': '',
'inputBlocks':'/DoubleMuParked/Run2012D-22Jan2013-v1/AOD#69384ec2-82eb-11e2-a89a-00221959e69e',
'parentFiles':
}
This was taken from
http://glidemon.web.cern.ch/glidemon/show.php?log=http://vocms95.cern.ch/mon/cms1925/140527_122818_crab3test-2:mmascher_crab_skim_200_preprod1/job_out.1.0.txt. Presumably other examples can be found there too.
The CRAB3 code that fills this structure is at
https://github.com/dmwm/CRABServer/blob/master/scripts/CMSRunAnalysis.py#L145.
The format is partly documented in the code comments there:
# Now, compute the strings about the input files. Each input file has the following format in the report:
#
# %(name)s::%(report_type)d::%(file_type)s::%(access_type)s::%(counter)d
#
# Where each variable has the following values:
# - name: the portion of the LFN after the common prefix
# - report_type: 1 for a file successfully processed, 0 for a failed file with an error message, 2 otherwise
# - file_type: EDM or Unknown
# - access_type: Remote or Local
# - counter: monotonically increasing counter
Multiple files are joined with semicolons:
inputString = ';'.join(["%s::1::%s::%s::%d" % tuple(value) for value in inputInfo.values()])
Note: due to current WMCore limitations, CRAB3 only ever reports report_type 1. In addition, all access types are currently listed as "Local", and parent information is never filled in.
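For anyone consuming these packets, here is a minimal parsing sketch based only on the format described in the comments above. The names InputFile and parse_input_files are hypothetical helpers invented for illustration, not part of CRAB3 or WMCore, and the sample string is a shortened copy of the 'inputFiles' value from the example report.

```python
from collections import namedtuple

# Hypothetical record type; field names follow the comment block quoted above.
InputFile = namedtuple(
    "InputFile", "name report_type file_type access_type counter"
)


def parse_input_files(input_string):
    """Split an 'inputFiles' string into InputFile records.

    Entries are separated by ';' and fields by '::'. rsplit is used so
    that only the last four '::' separators are treated as delimiters.
    """
    records = []
    for entry in input_string.split(";"):
        if not entry:
            continue  # tolerate an empty string or a trailing ';'
        name, report_type, file_type, access_type, counter = entry.rsplit("::", 4)
        records.append(
            InputFile(name, int(report_type), file_type, access_type, int(counter))
        )
    return records


# Two entries copied from the example report above.
sample = (
    "/30002/7EBCF273-2D84-E211-8075-485B39800B83.root::1::EDM::Local::4;"
    "/30002/62388A55-3184-E211-AC40-90E6BA19A245.root::1::EDM::Local::6"
)

for rec in parse_input_files(sample):
    print(rec.counter, rec.report_type, rec.access_type, rec.name)
```

Given the note above, a consumer should expect report_type to be 1 and access_type to be "Local" for every CRAB3 entry for now, so those fields carry little information yet.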
The basename is the common leading part of the file paths. For example, if we processed /store/foo and /store/bar, the basename would be /store/ and the %(name)s values referenced above would be "foo" and "bar", respectively.
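As a sketch of how a consumer might recover the full LFNs, the basename can simply be prepended to each name field. The function full_lfns is a hypothetical helper written for this note; it assumes, as in the examples above, that basename plus name reproduces the original path with no extra separator needed.

```python
def full_lfns(basename, input_files):
    """Rebuild full LFNs by prepending the basename to each entry's name.

    Assumes basename + name is the original LFN, as in the /store/foo
    example above (hypothetical helper, not CRAB3 code).
    """
    lfns = []
    for entry in input_files.split(";"):
        if not entry:
            continue  # skip empty entries
        name = entry.rsplit("::", 4)[0]  # first field is the name
        lfns.append(basename + name)
    return lfns


print(full_lfns("/store/", "foo::1::EDM::Local::1;bar::1::EDM::Local::2"))
# prints ['/store/foo', '/store/bar']
```

Applied to the dashboard example, the same call with 'Basename' and 'inputFiles' from the report would yield paths of the form /store/data/Run2012D/DoubleMuParked/AOD/22Jan2013-v1/30002/....root.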
--
TonyWildish - 06 Jun 2014