Some thoughts about the 2011 and 2012 data reprocessing campaigns

Scope

This twiki contains information on the operation of the reprocessing campaigns of 2012 and 2011 data, carried out in fall 2012 and winter/spring 2013.

Operational Setup

Attaching T2 sites

Several T2 sites were attached to T1 storage elements during these campaigns. Jobs at the attached sites were

  • downloading a RAW file to the local worker node
  • processing the RAW with standard Brunel and DaVinci/DQ steps
  • uploading the FULL.DST to the same storage element the RAW came from

The upload to the same storage element is not strictly necessary, but it was kept so that the granularity of runs is respected, i.e. all FULL.DST files of a given run end up at the same storage element.
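
For illustration only, a minimal sketch of this per-job workflow is shown below, using the standard DIRAC data management commands via subprocess. The LFNs, the SE name and the Brunel/DaVinci invocations are placeholders, not the real production steps; in reality this is all driven by the production system.

  # Illustrative sketch of the per-job workflow at an attached T2 site.
  # LFNs, SE name and option files are placeholders, not production values.
  import subprocess

  raw_lfn = "/lhcb/data/2012/RAW/FULL/example.raw"   # placeholder RAW LFN
  fulldst_lfn = "/lhcb/data/2012/FULL.DST/example.full.dst"   # placeholder output LFN
  t1_buffer_se = "CNAF-BUFFER"   # SE the site is attached to (example)

  # 1. download the RAW file to the local worker node
  subprocess.check_call(["dirac-dms-get-file", raw_lfn])

  # 2. process the RAW with the Brunel and DaVinci/DQ steps (placeholder options)
  subprocess.check_call(["gaudirun.py", "brunel-options.py"])
  subprocess.check_call(["gaudirun.py", "davinci-dq-options.py"])

  # 3. upload the FULL.DST to the same storage element the RAW came from
  subprocess.check_call(["dirac-dms-add-file", fulldst_lfn, "example.full.dst", t1_buffer_se])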

The list of attached sites is extracted from the CS and published as a table at http://lhcbproject.web.cern.ch/lhcbproject/Reprocessing/sites.html ; a snapshot taken during operation is shown below.

attachedT2s.png

CS entries for attaching a T2 site

  • Upload of the output file
    • Resources -> Sites -> LCG -> [Site] -> AssociatedSEs : Tier1-BUFFER = [T1]-BUFFER (e.g. CNAF-BUFFER)
  • Download of the input file
    • Operations -> LHCb-Productions -> SiteLocalSEMapping : [Site] = [T1]-RAW (e.g. LCG.CBPF.br = CNAF-RAW)
  • JobLimits (limit the number of concurrently running jobs)
    • Operations -> Defaults -> JobScheduling -> RunningLimit -> [Site] -> JobType : User = 0, DataStripping = 0, DataReconstruction = 0, DataReprocessing = (depends on the size of the site)
  • MatchingDelay (set a delay between two jobs of a given type matched at a site; this helps not to overload the storage and network with RAW downloads)
    • Operations -> Defaults -> JobScheduling -> MatchingDelay -> [Site] -> JobType : DataReprocessing = (e.g. 60, usually 30 - 60 (seconds))
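
As an illustration, the snippet below reads back these CS entries for one example site with the DIRAC configuration client. It assumes an installed and configured DIRAC client; the site and SE names are just the CBPF/CNAF example from the list above, and the Operations paths follow the entries exactly as written here.

  # Sketch: read back the CS options listed above for one attached T2 site.
  # Assumes a configured DIRAC client environment; site/SE names are examples.
  from DIRAC.Core.Base import Script
  Script.parseCommandLine(ignoreErrors=True)   # standard DIRAC script initialisation
  from DIRAC import gConfig

  site = "LCG.CBPF.br"

  # upload destination: T1 buffer SE associated to the site
  buffer_se = gConfig.getValue(
      "/Resources/Sites/LCG/%s/AssociatedSEs/Tier1-BUFFER" % site, "")

  # download source: T1 RAW SE the site reads its input from
  raw_se = gConfig.getValue(
      "/Operations/LHCb-Productions/SiteLocalSEMapping/%s" % site, "")

  # number of concurrently running DataReprocessing jobs allowed at the site
  running_limit = gConfig.getValue(
      "/Operations/Defaults/JobScheduling/RunningLimit/%s/JobType/DataReprocessing" % site, 0)

  # seconds between two matched DataReprocessing jobs at the site
  matching_delay = gConfig.getValue(
      "/Operations/Defaults/JobScheduling/MatchingDelay/%s/JobType/DataReprocessing" % site, 0)

  print("%s: buffer=%s raw=%s limit=%s delay=%s" %
        (site, buffer_se, raw_se, running_limit, matching_delay))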

Operations during the campaign

Setup of new productions

During the 2012 reprocessing new conditions were made available from time to time, which made it necessary to create new productions. In the end the 2012 reprocessing was done with ~ 12 different productions.

The 2011 data reprocessing was carried out with only two sets of productions (all conditions were known), one for MagUp and one for MagDown.

In general it is better to keep the number of productions to a bare minimum as this will simplify the closing of productions (see below).

Extending run ranges

If a production was eligible to run over several tens of thousands of files, its run range was limited in order not to overload the DIRAC agents with too many requests. The staging itself at the sites is handled by "slicing" the staging into 12 chunks over the day; this has worked out properly. A production was usually extended when the staging or waiting jobs for one site were close to being finished. The extension was usually on the order of 15-20 k files (to be checked with e.g. BK stats); a sketch of how such an extension could be sized is shown below.
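
The sketch below is purely illustrative: it picks the next run-range endpoint such that roughly the target number of new files becomes eligible. The list of (run, number of files) pairs is a placeholder; in practice it would come from a Bookkeeping query.

  # Illustrative only: choose the next run-range extension so that roughly
  # target_files new files become eligible. run_file_counts is a placeholder
  # for the result of a Bookkeeping query.
  def next_run_range(run_file_counts, last_run_in_production, target_files=20000):
      """Return (first_new_run, last_new_run, n_files) for the next extension."""
      new_runs = sorted((r, n) for r, n in run_file_counts if r > last_run_in_production)
      total, last_run = 0, last_run_in_production
      for run, nfiles in new_runs:
          if total and total + nfiles > target_files:
              break
          total += nfiles
          last_run = run
      return last_run_in_production + 1, last_run, total

  # example with made-up numbers
  run_file_counts = [(111000, 6000), (111001, 7000), (111002, 5000), (111005, 9000)]
  print(next_run_range(run_file_counts, 110999))   # -> (111000, 111002, 18000)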

Re-attaching T2 sites

For the 2012 data reprocessing it was tried to attach T2 sites to T1 storages according to their REBUS capacities, in order to have approximately the same CPU power per "cloud". It turned out that some sites (especially T1s) were providing much more CPU than published. Therefore sites had to be re-attached to other storage elements.
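
A sketch of the initial balancing idea is shown below: T2 sites are assigned greedily, by decreasing published capacity, to the "cloud" that currently has the least total CPU power. All capacity numbers are placeholders, not REBUS data, which as described above turned out to be unreliable for some sites.

  # Sketch of balancing T2 sites over T1 "clouds" by published CPU capacity.
  # Capacities (arbitrary units) are placeholders, not REBUS pledges.
  t1_cloud_cpu = {"CNAF": 30000, "GRIDKA": 28000, "IN2P3": 26000}
  t2_cpu = {"LCG.CBPF.br": 3000, "LCG.Manchester.uk": 8000,
            "LCG.Krakow.pl": 4000, "LCG.USC.es": 1500}

  assignment = {}
  # assign the largest T2s first, each to the currently weakest cloud
  for site, cpu in sorted(t2_cpu.items(), key=lambda kv: -kv[1]):
      cloud = min(t1_cloud_cpu, key=t1_cloud_cpu.get)
      t1_cloud_cpu[cloud] += cpu
      assignment[site] = cloud

  for site, cloud in sorted(assignment.items()):
      print("%s -> %s" % (site, cloud))
  print(t1_cloud_cpu)   # total CPU power per cloud after assignment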

Also during the re-processing campaigns sites were continuously re-attached to other storage elements. This was possible because of the FULL.DST format, which contains all information necessary for stripping, so the reco output does not need to be uploaded to the location of the "corresponding" RAW file. (Previously the stripping needed to download the reco/DST file plus the corresponding RAW file to process the data.)

Re-attaching a site also means that all currently processed files will be uploaded to the new storage element, so the FULL.DSTs of a given run may be split between two sites. In the end this produces small merged DST files, which was taken into account. Also note that the re-attachment of a T2 site needs to be done BEFORE a new production is launched or a run range is extended, because already created JDLs will not be re-written and will go to the storage the site was originally attached to.

Also, the fact that there are several places in the CS where T2 sites need to be attached to T1 storage (Resources, Operations) is not a comfortable situation and should be improved.

Controlling the amount of data staged

In order not to overload the tape storages, and to allow the input data resolution for reco input files, it is useful to limit the amount of data staged per 24 hours. This is done via CS entries in the section "Resources -> StorageElements -> [Site]-RAW" by setting the value of the option "DiskCacheTB". The value is the size in TB of the disk cache in front of the tape system. This value is divided by 12, and every 2 hours 1/12 of that amount of data is staged, so that over one day at most the full cache size is staged.
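
The arithmetic of this throttle is trivial but shown below for clarity; the DiskCacheTB values used here are placeholders, not the real CS settings.

  # Sketch of the staging throttle described above: the DiskCacheTB value of a
  # [Site]-RAW storage element is split into 12 slices, one released every 2 hours.
  disk_cache_tb = {"CNAF-RAW": 300.0, "GRIDKA-RAW": 100.0}   # placeholder values

  for se, cache_tb in sorted(disk_cache_tb.items()):
      per_slice_tb = cache_tb / 12.0   # amount allowed to be staged per 2-hour slice
      print("%s: %.1f TB every 2 hours, %.1f TB per 24 hours" % (se, per_slice_tb, cache_tb))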

Some storage elements (e.g. CNAF) are more flexible: there this value can easily be boosted to a very high value in order to stage a lot of data at once, and the limit is then the stager system itself.

Monitoring

Additional monitoring pages and tools have been set up for the reprocessing campaigns; they are described below.

Evolution of reprocessing

Two pages at http://lhcbproject.web.cern.ch/lhcbproject/Reprocessing/stats.html extract the number of reconstructed files and the pb-1 reconstructed and merged by physics stream. The script is run under the "lbdirac" account (see the acron jobs there) and provides JSON output which is consumed by a Google chart.
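
A minimal sketch of the kind of JSON snapshot such a script could emit for the chart is shown below. The field names and numbers are invented for illustration and do not reflect the actual format produced by the lbdirac script.

  # Illustrative only: write a small JSON snapshot of the reprocessing progress
  # that a web page (e.g. a Google chart) could consume. Field names and values
  # are invented; the real script's output format may differ.
  import json
  import time

  snapshot = {
      "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
      "reconstructed_files": 123456,
      "reconstructed_pb_inv": 1050.3,
      "merged_pb_inv_by_stream": {"EW": 250.1, "Charm": 300.7, "Bhadron": 499.5},
  }

  with open("reprocessing-progress.json", "w") as out:
      json.dump(snapshot, out, indent=2)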

progress-files.png progress-pb.png

Status of reprocessing

Details on the number of files, runs and pb-1 reconstructed, stripped and merged by polarity, and the progress over the last 24 hours. The script was developed by Philippe. See also e.g. http://lhcbproject.web.cern.ch/lhcbproject/Reprocessing/stats.html

progress.png

Problems

Priorities

Handling of priorities was a major problem during these campaigns and involved a lot of manual operations, especially in the DIRAC CS. The problems were:

  • Priorities between JobTypes (Reco, Simu) were not respected. The reason is that the reco jobs need to be run with a "MatchingDelay", i.e. only every X seconds a reco job is allowed to match at a T2 site in order not to overload the network and/or storage. In principle MC jobs have a lower priority, but in between two matched reco jobs N MC jobs were matched, thereby using up the job slots at a given site.
  • Priorities between Reco and Stripping were not respected. Although DataStripping was submitted with priority 5 (reco with priority 2), the stripping jobs were accumulating at the T1 sites. Therefore reco had to be restricted.

Production Closing

Closing of productions is extremely tedious and is being handled by several Jira sprints (see descriptions of problems there).

Topic attachments
Attachment: attachedT2s.png (PNG, 206.2 K, 2013-02-21, StefanRoiser)
Attachment: progress-files.png (PNG, 74.9 K, 2013-02-26, StefanRoiser)
Attachment: progress-pb.png (PNG, 156.7 K, 2013-02-26, StefanRoiser)
Attachment: progress.png (PNG, 216.7 K, 2013-02-26, StefanRoiser)