Legacy File Insertion Tool

Description

This tool inserts legacy files, with minimal information, into the standard CMS data catalogs, i.e. DBS2, DBS3 and PhEDEx. The process is as follows:

  1. A standard WMAgent is installed, following the usual workflow team instructions.
  2. A script, described in the next section, inserts the files into the WMAgent's DBSBuffer tables.
  3. The WMAgent is started and the blocks are created/inserted with the usual machinery.

The script

The script inserts files into the DBSBuffer tables. The file information comes from a JSON file, which must contain a JSON array in which each element is a JSON object representing one file to be inserted. Each object has the following elements:

  • dataset : The dataset path for the file; it must adhere to CMS standards.
  • lfn : The logical file name.
  • size : The size of the file in bytes. Optional.
  • events : The number of events in the file. Optional.
  • globalTag : The global tag for the file. Optional.
  • runsAndLumis : A JSON object mapping each run to the list of lumis in the file. Optional.
  • checksums : A JSON object with the checksums for the file. A "cksum" entry is required; Adler32 and md5 are supported but optional.
  • location : The SE (storage element) where the file is located.
  • cmssw : The CMSSW release that produced the file. Optional.

An example file:

    [{"dataset" : "/DummyDataset1/Summer91-TripleFiltered-v1/RECO",
      "lfn" : "/store/data/Summer91/DummyDataset1/RECO/TripleFiltered-v1/00000/0C390645-DDF3-E211-8A8E-003048F2B2C6.root",
      "size" : 100,
      "events" : 20,
      "cmssw" : "CMSSW_7_0_0",
      "checksums" : {"cksum" : 1},
      "globalTag" : "GT_Test_V1",
      "location" : "srm-cms.cern.ch"},
     {"dataset" : "/DummyDataset1/Summer91-TripleFiltered-v1/RECO",
      "lfn" : "/store/data/Summer91/DummyDataset1/RECO/TripleFiltered-v1/00000/8C9F5C1B-E3F3-E211-871D-00A0D1EE8AF4.root",
      "size" : 120,
      "events" : 25,
      "globalTag" : "GT_Test_V1",
      "checksums" : {"cksum" : 1},
      "location" : "srm-cms.cern.ch"},
     {"dataset" : "/DummyDataset2/Summer91-Processed-v2/AOD",
      "lfn" : "/store/data/Summer91/DummyDataset2/RECO/Processed-v2/00000/8C9F5C1B-E3F3-E211-871D-00A0D1EE8AF4.root",
      "runsAndLumis" : {"29" : [1,3,4,5],
                        "21" : [12,31]},
      "checksums" : {"cksum" : 1},
      "location" : "srm-cms.cern.ch"}]

Once the file describing the legacy files is ready and a WMAgent has been installed and initialized, the files can be inserted into the DBSBuffer. The script is available at:

https://gist.github.com/dballesteros7/6270487

It requires two inputs:

  • --files : The path to the JSON file.
  • --config : The path to the WMAgent configuration file, usually /data/srv/wmagent/current/config/wmagent/config.py.

Additionally, it has two configurable options:

  • --inBlock : The number of files to allocate per block; defaults to 500.
  • --timeout : How long the WMAgent should wait before closing the created blocks; defaults to 16h. The live example below passes this value in seconds (3600 for one hour).
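
To illustrate what a files-per-block limit such as --inBlock implies, here is a rough sketch of how a list of file entries could be partitioned. This is an assumption for illustration only: the source does not describe how the script assigns files to blocks, and plan_blocks is a hypothetical helper. The sketch groups files by dataset and then cuts each group into chunks of at most in_block files.

```python
from collections import defaultdict


def plan_blocks(entries, in_block=500):
    """Group entries by dataset, then cut each group into chunks of at most `in_block` LFNs.

    Returns a list of (dataset, [lfn, ...]) tuples, one per planned block.
    """
    if in_block < 1:
        raise ValueError("in_block must be a positive integer")
    by_dataset = defaultdict(list)
    for entry in entries:
        by_dataset[entry["dataset"]].append(entry["lfn"])
    blocks = []
    for dataset, lfns in by_dataset.items():
        # Step through the LFNs in strides of in_block; the last chunk may be shorter.
        for start in range(0, len(lfns), in_block):
            blocks.append((dataset, lfns[start:start + in_block]))
    return blocks
```

With this scheme, 120 files in one dataset and --inBlock 50 would yield three blocks of 50, 50 and 20 files.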

The script requires the WMCore libraries. To make them available, source the environment file /data/srv/wmagent/current/apps/wmagent/etc/profile.d/init.sh of a standard WMAgent installation.

Preparation

A checklist of what is needed before using the script:

  • An installed WMAgent, set up following the standard instructions of the workflow team. The agent's components must be stopped, but MySQL must be running. The agent must have valid service certificates.
  • The JSON files with the information about the legacy files. All files listed in a single JSON file must be at the same location.
  • The aforementioned script.
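
Because all files in one JSON file must be at the same location, a mixed list has to be split before insertion. The sketch below shows one way to do that; split_by_location and write_per_location are hypothetical helpers, and the output file naming scheme is an arbitrary choice made for this example.

```python
import json
from collections import defaultdict


def split_by_location(entries):
    """Return a dict mapping each location to the list of entries stored there."""
    grouped = defaultdict(list)
    for entry in entries:
        grouped[entry["location"]].append(entry)
    return dict(grouped)


def write_per_location(entries, prefix):
    """Write one JSON file per location, named '<prefix>.<location>.json'.

    Returns the list of paths written, in order of first appearance of each location.
    """
    paths = []
    for location, group in split_by_location(entries).items():
        path = f"{prefix}.{location}.json"
        with open(path, "w") as handle:
            json.dump(group, handle, indent=2)
        paths.append(path)
    return paths
```

Each resulting file can then be passed to the script in a separate run.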

Live Example

Here are the steps for a successful insertion, assuming that the JSON file with the file information is located at /tmp/legacyFiles.json.

  1. After the WMAgent installation, keep the WMComponents shut down. MySQL must be running, i.e. the services have been started with start-services.
  2. Source the environment
    source /data/admin/wmagent/env.sh 
    source /data/srv/wmagent/current/apps/wmagent/etc/profile.d/init.sh
    
  3. Run the script
    python /data/admin/wmagent/FileInjection.py --config /data/srv/wmagent/current/config/wmagent/config.py --files /tmp/legacyFiles.json --inBlock 50 --timeout 3600
    
  4. Start the DBS and PhEDEx components
    $manage execute-agent wmcoreD --start --components=PhEDExInjector,DBS3Upload,DBSUpload
    
  5. Wait an hour (the value configured with --timeout); by then all blocks should be closed and injected into DBS2, DBS3 and PhEDEx.
  6. If another list of files is to be inserted, first shut down the components and then repeat the procedure.
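
When the procedure is repeated for several file lists, the command from step 3 can be assembled programmatically. The following is a small sketch under stated assumptions: build_injection_command is a hypothetical helper, the default paths are the ones used in the example above, and the --timeout value is assumed to be in seconds, as suggested by the pairing of 3600 with a one-hour wait.

```python
def build_injection_command(
    files_path,
    config_path="/data/srv/wmagent/current/config/wmagent/config.py",
    script_path="/data/admin/wmagent/FileInjection.py",
    in_block=500,
    timeout_hours=16.0,
):
    """Return the argument list for one run of the insertion script."""
    # Assumption: the script expects the timeout in seconds.
    timeout_seconds = int(timeout_hours * 3600)
    return [
        "python", script_path,
        "--config", config_path,
        "--files", files_path,
        "--inBlock", str(in_block),
        "--timeout", str(timeout_seconds),
    ]
```

The returned list can be passed to subprocess.run(..., check=True) from a shell in which the environment files of step 2 have already been sourced.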
Topic revision: r1 - 2013-08-19 - DiegoBallesterosVillamizar
 