Using the Any Data, Any Time, Anywhere (AAA) Infrastructure
This page should be considered obsolete! Please see this page instead
The service is in production but still fairly new, so not every way of using it is known yet. This page describes some of the more popular methods.
Keep in mind that the AAA xrootd service is read-only. It is meant to let you access files, not write out results.
Which files are available?
The files you can get from this service are limited to those hosted at sites that run the service. The current participants are:
- US Region
  - T1_US_FNAL (disk-only; includes test EOS service)
  - T2_US_Caltech
  - T2_US_Florida
  - T2_US_Nebraska
  - T2_US_Purdue
  - T2_US_UCSD
  - T2_US_Wisconsin
  - T2_US_MIT
  - T2_US_Vanderbilt
- EU Region
  - T1_CH_CERN (EOSCMS only)
  - T2_IT_Bari
  - T2_IT_Pisa
  - T2_IT_Legnaro
  - T2_DE_DESY
  - T2_UK_*
  - T2_EE_Estonia
In general, any of the latest AOD data is available.
For the remainder of this page, we assume that you want to access a file with the CMS name:
/store/foo
This could be a part of a production dataset (/store/data/foo) or one of your user files at a T2 (/store/user/bbockelm/foo).
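The examples later on this page all address the file as the redirector host followed by the CMS name. As a minimal sketch of that mapping, assuming the xrootd.unl.edu redirector used in those examples (the function name lfn_to_xrootd_url is hypothetical, chosen only for illustration):

```python
def lfn_to_xrootd_url(lfn, redirector="xrootd.unl.edu"):
    """Turn a CMS name such as /store/foo into a root:// URL."""
    if not lfn.startswith("/store/"):
        raise ValueError("expected a CMS name starting with /store/, got %r" % lfn)
    # The double slash after the host is intentional:
    # one slash ends the host part, the second begins the absolute path.
    return "root://%s/%s" % (redirector, lfn)


print(lfn_to_xrootd_url("/store/foo"))
```

The resulting string is the form passed to xrdcp, TFile::Open and PoolSource in the sections below.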
Getting the tools
To use Xrootd, you need a minimal grid environment (CA certificates and a grid certificate are needed). From lxplus, source the following scripts:
source /afs/cern.ch/cms/cmsset_default.sh
source /afs/cern.ch/cms/LCG/LCG-2/UI/cms_ui_env.sh
If you do not work on lxplus, ask your local site admin for the equivalent setup.
Next, set up your ROOT or CMSSW working environment. For example, if you use CMSSW on lxplus:
cmsrel CMSSW_6_0_0
cd CMSSW_6_0_0
cmsenv
However, bare ROOT also works fine.
Authentication
The Xrootd data service only allows GSI authentication. You must have a grid certificate installed in ~/.globus and registered in CMS. This is covered in Chapter 5 of the CMS workbook.
If you do not have a proxy in your environment when the job starts, xrootd will prompt you to create one.
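Before starting a job, it can help to confirm that the certificate is actually in place. The sketch below only checks that the files exist. It assumes the conventional grid file names usercert.pem and userkey.pem, which this page does not state, and the helper name missing_grid_credentials is hypothetical. It does not check registration in CMS or the presence of a proxy.

```python
import os


def missing_grid_credentials(globus_dir=None):
    """Return the expected certificate files that are absent from globus_dir."""
    if globus_dir is None:
        globus_dir = os.path.join(os.path.expanduser("~"), ".globus")
    # Assumed file names: the usual grid convention, not taken from this page.
    expected = ["usercert.pem", "userkey.pem"]
    return [name for name in expected
            if not os.path.isfile(os.path.join(globus_dir, name))]


missing = missing_grid_credentials()
if missing:
    print("Missing from ~/.globus: " + ", ".join(missing))
else:
    print("Certificate and key found in ~/.globus")
```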
Download whole files with command-line tools
The command-line tool for xrootd is called xrdcp. This utility ships with stand-alone ROOT and CMSSW. Here are the steps for using it:
- Initialize your ROOT or CMSSW environment (for CMSSW, run cmsenv).
- Run:
xrdcp root://xrootd.unl.edu//store/foo /some/local/path
You will get a progress bar as the file downloads. You may also want to consider the -R option, which recursively downloads a directory.
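If you need to script several downloads, the xrdcp call above can be wrapped in a few lines of Python. This is a sketch only: the function names xrdcp_command and download are hypothetical, the redirector default is the one used on this page, and the only xrdcp option used is the -R flag described above. It assumes xrdcp is on your PATH after the environment setup.

```python
import subprocess


def xrdcp_command(lfn, dest, redirector="xrootd.unl.edu", recursive=False):
    """Build the xrdcp argument list for one CMS name."""
    cmd = ["xrdcp"]
    if recursive:
        cmd.append("-R")  # recursive directory download, as described above
    cmd.append("root://%s/%s" % (redirector, lfn))
    cmd.append(dest)
    return cmd


def download(lfn, dest, **kwargs):
    """Run xrdcp and return its exit status (0 means success)."""
    return subprocess.call(xrdcp_command(lfn, dest, **kwargs))


# Show the command that would be run, without contacting the service.
print(" ".join(xrdcp_command("/store/foo", "/some/local/path")))
```

Building the command as a list rather than a single string avoids shell quoting problems if a path contains unusual characters.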
Open a file using ROOT
If you are using bare ROOT, you can open files in the xrootd service just like you would any other file:
TFile::Open("root://xrootd.unl.edu//store/foo");
This returns a pointer to a TFile object, and you can proceed normally.
Open a file in CMSSW
Edit the PoolSource line to point directly at the xrootd service, instead of using a generic LFN.
For example, this might be the "before" picture:
process.source = cms.Source("PoolSource",
    # replace 'myfile.root' with the source file you want to use
    fileNames = cms.untracked.vstring('/store/foo')
)
Here's the same file, but accessed through the Xrootd Service:
process.source = cms.Source("PoolSource",
    # replace 'myfile.root' with the source file you want to use
    fileNames = cms.untracked.vstring('root://xrootd.unl.edu//store/foo')
)
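When a configuration lists many input files, the same rewrite can be applied to the whole list rather than by hand. A possible sketch, in which the helper name to_xrootd_urls is hypothetical and the redirector is the one used above:

```python
def to_xrootd_urls(file_names, redirector="xrootd.unl.edu"):
    """Prefix bare /store/ names with the redirector; leave other entries alone."""
    prefix = "root://%s/" % redirector
    return [prefix + name if name.startswith("/store/") else name
            for name in file_names]


inputs = ["/store/foo", "root://xrootd.unl.edu//store/foo"]
for name in to_xrootd_urls(inputs):
    print(name)
```

Entries that already carry a prefix are left untouched, so the helper can be applied to a list more than once. In a configuration file, the result could be passed on as cms.untracked.vstring(*to_xrootd_urls(inputs)).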
Run an analysis using CRAB
When running analysis using CRAB3 or the remoteGlidein scheduler in CRAB2, if your job has been queued for more than 12 hours, it is eligible for "overflow". In such a case, your job may be run at additional US sites (currently, T2_US_Purdue, T2_US_Nebraska, T2_US_Wisconsin, or T2_US_UCSD), even if the files are not present at those sites. When run, the jobs will automatically switch to reading from the redirector.
Nothing needs to be done on the user side to enable overflow.
Listing the contents of a directory
LISTING CONTENTS CANNOT BE DONE IN GENERAL - in the same way that you cannot list the directory of a webpage. Use something like DBS to determine the files you need.
However, there are some special cases. If you'd like to list the contents of a directory at Nebraska, you can use the xrd command:
xrd red-gridftp1.unl.edu
xrd is an FTP-like client. You can then type:
dirlist /store/data
and use the exit command to exit. Alternatively, you can just download the contents of a whole directory:
xrdcp -R root://red-gridftp1.unl.edu//store/data/Commissioning10/MinimumBias/RECO/SD_EG-Jun14thSkim_v1/0135 /tmp
Again, this will only work for data at Nebraska (which partially defeats the purpose of the global redirector).