
Configuration: Dataset discovery and job configuration

Introduction

Data discovery and job configuration are among the most important steps in using CRAB because they define which dataset your jobs run on. This page describes data discovery and the CRAB configuration parameters for dataset selection. It also describes how to control at which GRID site the jobs will run.

Data discovery

CMS provides the user with a dataset discovery service at http://cmsdbs.cern.ch/discovery/. Various options are available to narrow down the selection of datasets. More information can be found at https://twiki.cern.ch/twiki/bin/view/CMS/WorkBookDataSamples.

The result of the selection process is a single datasetpath in the format /<dataset>/<tier>/<processed dataset> and a list of storage elements where this dataset is available. Enter the datasetpath into the datasetpath field in the [CMSSW] section of your CRAB configuration file.
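The datasetpath entry can be illustrated with a minimal Python sketch. It is not part of CRAB: the helper split_datasetpath, the field names, and the placeholder path /MyDataset/MyTier/MyProcessedDataset are assumptions made for illustration. The sketch reads a [CMSSW] section with Python's standard configparser and splits the value into the three components of the format above.

```python
import configparser

# Hypothetical crab.cfg fragment; the dataset path is a placeholder,
# not a real dataset from the discovery service.
CRAB_CFG = """
[CMSSW]
datasetpath = /MyDataset/MyTier/MyProcessedDataset
"""

def split_datasetpath(path):
    """Split '/<dataset>/<tier>/<processed dataset>' into its three parts."""
    parts = path.strip().split("/")
    # The path starts with '/', so the first element of the split is empty.
    if len(parts) != 4 or parts[0] != "" or not all(parts[1:]):
        raise ValueError(f"unexpected datasetpath format: {path!r}")
    return {
        "dataset": parts[1],
        "tier": parts[2],
        "processed_dataset": parts[3],
    }

config = configparser.ConfigParser(interpolation=None)
config.read_string(CRAB_CFG)
fields = split_datasetpath(config["CMSSW"]["datasetpath"])
print(fields)
```

A check like this only catches a malformed path before submission; whether the dataset actually exists is still decided by the discovery service.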

CRAB will automatically configure your jobs to run over the correct files and will use one of the available GRID sites to run them.

Control at which GRID site the jobs will run

To control at which GRID site the jobs will run, you can use the list of storage elements from the discovery service. To restrict your jobs to specific sites, specify a comma-separated list of the corresponding storage elements in the [EDG] section:

se_white_list = cmssrm.fnal.gov,srm.cern.ch

Conversely, to exclude sites from running your jobs, provide a comma-separated list of the corresponding storage elements in the [EDG] section:

se_black_list = cmssrm.fnal.gov,srm.cern.ch

You can further narrow down the site selection by using additional compute element criteria:

ce_white_list = cmslcgce.fnal.gov

ce_black_list = cmslcgce.fnal.gov
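The effect of the white and black lists can be sketched in Python as follows. This is only an illustration under stated assumptions: the helpers read_list and select are hypothetical, se.example.org is an invented storage element, and exact name matching is assumed here, which may differ from how CRAB itself matches list entries.

```python
import configparser

# [EDG] fragment using the parameters described above.
EDG_CFG = """
[EDG]
se_white_list = cmssrm.fnal.gov,srm.cern.ch
ce_black_list = cmslcgce.fnal.gov
"""

def read_list(section, key):
    """Return a comma-separated option as a list of names (empty if unset)."""
    raw = section.get(key, fallback="")
    return [item.strip() for item in raw.split(",") if item.strip()]

def select(candidates, white, black):
    """Keep candidates on the white list (if one is given) and not on the black list."""
    return [c for c in candidates
            if (not white or c in white) and c not in black]

config = configparser.ConfigParser(interpolation=None)
config.read_string(EDG_CFG)
edg = config["EDG"]

# Hypothetical storage elements as reported by the discovery service.
available_ses = ["cmssrm.fnal.gov", "srm.cern.ch", "se.example.org"]
allowed_ses = select(available_ses,
                     read_list(edg, "se_white_list"),
                     read_list(edg, "se_black_list"))
print(allowed_ses)  # ['cmssrm.fnal.gov', 'srm.cern.ch']
```

An empty white list is treated as "no restriction", so only the black list applies in that case.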

Note: the Condor-G direct submission mode requires exactly one storage element to be selected by the se_white_list parameter. If more than one compute element is associated with that storage element, you must select one of them using the ce_white_list parameter.
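The Condor-G rule in the note can be expressed as a small pre-submission check. The sketch below is an assumption-laden illustration, not CRAB code: check_condor_g, split_list, the mapping CES_FOR_SE, and the compute element ce2.example.org are all hypothetical, and in practice the storage-to-compute-element association would have to come from the site information.

```python
import configparser

def split_list(value):
    """Turn a comma-separated string into a list of stripped names."""
    return [item.strip() for item in value.split(",") if item.strip()]

def check_condor_g(cfg_text, ces_for_se):
    """Return a list of problems for Condor-G direct submission (empty if none)."""
    config = configparser.ConfigParser(interpolation=None)
    config.read_string(cfg_text)
    edg = config["EDG"] if config.has_section("EDG") else {}
    ses = split_list(edg.get("se_white_list", ""))
    ces = split_list(edg.get("ce_white_list", ""))
    problems = []
    if len(ses) != 1:
        problems.append("se_white_list must select exactly one storage element")
    elif len(ces_for_se.get(ses[0], [])) > 1 and len(ces) != 1:
        # Several compute elements serve this storage element: one must be chosen.
        problems.append("ce_white_list must select one compute element")
    return problems

# Hypothetical association of storage elements with compute elements.
CES_FOR_SE = {"cmssrm.fnal.gov": ["cmslcgce.fnal.gov", "ce2.example.org"]}

ambiguous = "[EDG]\nse_white_list = cmssrm.fnal.gov,srm.cern.ch\n"
valid = ("[EDG]\nse_white_list = cmssrm.fnal.gov\n"
         "ce_white_list = cmslcgce.fnal.gov\n")

print(check_condor_g(ambiguous, CES_FOR_SE))
print(check_condor_g(valid, CES_FOR_SE))
```

The first configuration is reported because two storage elements are listed; the second passes because one storage element and one compute element are selected.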

Software availability

A key requirement for successful CRAB submission is that the software version you use is installed at the sites where your jobs will run. To check which sites have that version installed, consult the following pages:

Official page: http://cmsdoc.cern.ch/cms/ccs/wm/www/Crab/cmsCE.html
Private monitoring page: http://home.fnal.gov/~burt/all_cmssoft.html

Previous: Input and Output handling

Top: Main page

Next: Example CRAB configuration file

Topic revision: r2 - 2007-03-27 - OliverGutsche
