Configuration: Dataset discovery and job configuration

Introduction

Data discovery and job configuration are among the most important steps in using CRAB, as they define which dataset your jobs will run on. This page describes data discovery and the CRAB configuration for dataset selection. It also describes how to control at which GRID site the jobs will run.

Data discovery

CMS provides a dataset discovery service at http://cmsdbs.cern.ch/discovery/. Various options are available to narrow down the selection of datasets. More information can be found at https://twiki.cern.ch/twiki/bin/view/CMS/WorkBookDataSamples.

The result of the selection process is a single datasetpath in the format /<dataset>/<tier>/<processed dataset> and a list of storage elements where this dataset is available. Enter the datasetpath into the datasetpath field in the [CMSSW] section of your CRAB configuration file.
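The three-part datasetpath format can be checked before it is pasted into the configuration file. The following is a minimal sketch, not part of CRAB: the helper name split_datasetpath and the example path are hypothetical and serve only to illustrate the /<dataset>/<tier>/<processed dataset> layout described above.

```python
def split_datasetpath(datasetpath):
    """Split '/<dataset>/<tier>/<processed dataset>' into its three parts."""
    parts = datasetpath.strip().split("/")
    # A valid path starts with '/', so the first element is empty and is
    # followed by exactly three non-empty components.
    if len(parts) != 4 or parts[0] != "" or not all(parts[1:]):
        raise ValueError(
            "expected /<dataset>/<tier>/<processed dataset>, got %r" % datasetpath
        )
    dataset, tier, processed = parts[1:]
    return {"dataset": dataset, "tier": tier, "processed_dataset": processed}


# Hypothetical datasetpath, used only for illustration.
example = split_datasetpath("/ExampleDataset/RECO/ExampleProcessed")
print(example)
```

A path that is missing the leading slash or one of the three components raises a ValueError, which catches copy-and-paste mistakes before the job is created.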

CRAB will automatically configure your jobs to run over the correct files and will use one of the available GRID sites to run them.

Control at which GRID site the jobs will run

To control at which GRID site the jobs will run, use the list of storage elements from the discovery service. To restrict your jobs to specific sites, specify a comma-separated list of the corresponding storage elements in the [EDG] section:

se_white_list = cmssrm.fnal.gov,srm.cern.ch

Conversely, to exclude sites from running your jobs, provide a comma-separated list of the corresponding storage elements in the [EDG] section:

se_black_list = cmssrm.fnal.gov,srm.cern.ch

You can further narrow down the site selection by using additional compute element criteria:

ce_white_list = cmslcgce.fnal.gov

ce_black_list = cmslcgce.fnal.gov

Note: the Condor-G direct submission mode requires exactly one storage element to be selected by the se_white_list parameter. If more than one compute element is associated with that storage element, you have to specify one using the ce_white_list parameter.
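The Condor-G rule on se_white_list can be verified before submission. The sketch below is an illustration under stated assumptions, not CRAB's own validation: it assumes the configuration file can be read with Python's standard configparser module, and the helper names read_list and check_condor_g are hypothetical. It does not check the compute element rule, because that requires knowing which compute elements belong to a storage element, and this information is not in the configuration file.

```python
import configparser

# Hypothetical configuration fragment reusing the host names from the text.
EXAMPLE_CFG = """
[EDG]
se_white_list = cmssrm.fnal.gov
ce_white_list = cmslcgce.fnal.gov
"""


def read_list(cfg, section, option):
    """Return a comma-separated option as a list of stripped, non-empty entries."""
    raw = cfg.get(section, option, fallback="")
    return [item.strip() for item in raw.split(",") if item.strip()]


def check_condor_g(cfg):
    """Return a list of problems with the Condor-G rule; empty if none."""
    problems = []
    se_white = read_list(cfg, "EDG", "se_white_list")
    if len(se_white) != 1:
        problems.append(
            "se_white_list must name exactly one storage element, found %d"
            % len(se_white)
        )
    return problems


cfg = configparser.ConfigParser()
cfg.read_string(EXAMPLE_CFG)
print(check_condor_g(cfg))
```

With the two-entry list se_white_list = cmssrm.fnal.gov,srm.cern.ch from the earlier example, the check reports one problem, since Condor-G direct submission accepts only a single storage element.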

Software availability

A key requirement for successful CRAB submission is that the software version you use is installed at the sites where the jobs will run. To check which sites have your software version installed, use the following sites:

Topic revision: r2 - 2007-03-27 - OliverGutsche
 