USCMS T3 Xrootd Architecture
This page describes a specialized deployment of Xrootd for USCMS T3 sites. The goal of this configuration is to allow the T3 site to run without any dedicated data management services (no
PhEDEx is needed) by integrating with the global Xrootd data access architecture.
The T3 service and the global service are part of a fledgling demonstrator project. Participants may need to provide lots of feedback to the developers, may experience many problems, and may have to upgrade frequently.
The T3 architecture has three types of servers:
- The T3 redirector. This serves as the headnode of the T3 storage cluster.
- A highly reliable local disk server (may be a RAID1 NFS server or a reliable storage element, such as HDFS). This will hold the non-CMS, local files at your T3. If you lose this server, you cannot recover your files. Ideally, this would be backed up elsewhere.
- One or more cache servers. These will hold all the CMS experimental files that can be re-downloaded from elsewhere.
How it works
Here's the process the cluster goes through to service a read operation:
- A file is opened in ROOT or CMSSW; the xrootd client sends out a request to the redirector.
- The redirector will then query all the disk servers at the T3 to see if the file is found locally.
- If the file is at a local disk server, the xrootd client is redirected to the local disk server, which starts streaming the data to ROOT.
- If the file is not found locally, the redirector will select a cache server to download the file to the T3.
- The cache server acts as a client for the global xrootd service and downloads the file. If multiple T0/T1/T2 sites host the file, the cache server will download from all of those sites using BitTorrent-like logic.
- If the cache server successfully downloads the missing file, it will start serving the file to the client.
When a write occurs, the client is redirected to the local disk server.
Note that the redirector keeps no permanent state information - just a memory cache of file locations. As a result, it cannot maintain a consistent namespace of the available files (it behaves more like a distributed system such as DNS than like a normal file system). We recommend mounting the local disk server on login nodes so users can use traditional POSIX tools (ls, cp, mv, etc.) to manage their files.
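For example, once the cluster is up, a user can read a file through the redirector with xrdcp. The hostname and file path below are hypothetical placeholders; substitute your site's values:

```shell
# Build the xrootd URL for a file served by the T3 cluster.
# redirector.example.com and the /store path are placeholder values.
REDIRECTOR=redirector.example.com
FILE=/store/user/jdoe/test.root
URL="root://${REDIRECTOR}/${FILE}"
echo "$URL"

# From a login or worker node, the file can then be copied locally:
# xrdcp "$URL" /tmp/test.root
```

Whether the file comes from the local disk server or is first fetched by a cache server is transparent to the client.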
The image below illustrates how the proposed T3 architecture works:
Installation and Configuration
This section walks you through the process of installing an Xrootd cluster at your T3 site and integrating it with the global system.
Hostnames
Throughout the install document, we will use these hostnames to refer to different hosts in the system. You will need to replace these with the correct ones for your site:
- redirector.example.com: The Xrootd redirector, which does not serve data and is the "headnode" for your cluster.
- nfs.example.com: This is the Xrootd data server which will be storing unique local data.
- cache01.example.com: This is one of the Xrootd data servers which will be caching CMS experiment data.
Install
First, install the OSG Xrootd repository:
if [ ! -e /etc/yum.repos.d/osg-xrootd.repo ]; then
  rpm -Uvh http://newman.ultralight.org/repos/xrootd/x86_64/osg-xrootd-1-1.noarch.rpm
fi
Then, install Xrootd using yum. This will add the xrootd user if it does not already exist - ROCKS users might want to create this user beforehand.
yum install xrootd
The version should be at least 1.4.2-4. If the node does not already have CA certificates and fetch-crl installed, you can also get them from the OSG Xrootd repo:
yum install fetch-crl osg-ca-certs
Configuration
Copy the sample T3 configuration file over to /etc/xrootd/xrootd.cfg:
cp /etc/xrootd/xrootd.sample.t3.cfg /etc/xrootd/xrootd.cfg
Edit the top four lines:
set t3_redirector = redirector.example.com
set global_redirector = xrootd.unl.edu
set monitoring_host = $global_redirector
set storage_directory = /mnt/raid1
Almost all sites can keep global_redirector and monitoring_host the same, but most will need to edit their own t3_redirector and, for the cache servers, the storage_directory. The storage_directory is the directory where files will be written.
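For instance, a cache server whose scratch disk is mounted at /scratch/xrootd (a hypothetical path; use your own mount point) might use:

```
set t3_redirector = redirector.example.com
set global_redirector = xrootd.unl.edu
set monitoring_host = $global_redirector
set storage_directory = /scratch/xrootd
```

The NFS/local disk server would instead point storage_directory at its reliable storage, such as /mnt/raid1 from the sample above.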
Finally, create a copy of the host certs to be xrootd service certs:
mkdir -p /etc/grid-security/xrd
cp /etc/grid-security/hostcert.pem /etc/grid-security/xrd/xrdcert.pem
cp /etc/grid-security/hostkey.pem /etc/grid-security/xrd/xrdkey.pem
chown xrootd: -R /etc/grid-security/xrd
chmod 400 /etc/grid-security/xrd/xrdkey.pem # Yes, 400 is required
Operation
Xrootd is operated as a traditional Unix init service. As root, on all nodes, you need to start both cmsd and xrootd:
service cmsd start
service xrootd start
These daemons will drop privileges to the xrootd user.
On the cache servers, you will also need to start the frm_xfrd daemon:
service frm_xfrd start
Once you are ready, you can also start the frm_purged daemon on the cache servers. This service will purge old files from the cache.
DO NOT RUN THIS ON A SERVER WITH LOCAL FILES, as it will delete the local files in addition to the experiment files.
service frm_purged start
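To summarize, a cache server runs all four daemons. The snippet below merely prints the start commands in order so the sequence can be reviewed (or piped to sh on a real cache server):

```shell
# Daemons a cache server needs, in start order (service names from this page).
# Echoed rather than executed so the sequence can be inspected first.
for svc in cmsd xrootd frm_xfrd frm_purged; do
  echo service "$svc" start
done
```

On the local disk server holding unique files, leave frm_purged out of the list, per the warning above.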
The logfiles are kept in:
/var/log/xrootd/xrootd.log
/var/log/xrootd/cmsd.log
The typical init commands should work (start, stop, restart, status, condrestart).
Port usage:
The following information is probably needed for sites with strict firewalls:
- The xrootd server listens on TCP port 1094.
- The cmsd server needs outgoing TCP port 1213 to xrootd.unl.edu. On the redirector, it also needs incoming TCP port 1213.
- Usage statistics are sent to xrootd.unl.edu on UDP ports 3333 and 3334.
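As a sketch, iptables rules matching the list above might look like the following fragment (assuming a default-deny INPUT chain; adapt to your site's firewall tooling):

```
# All xrootd nodes: accept incoming data traffic.
-A INPUT -p tcp --dport 1094 -j ACCEPT
# Redirector only: accept incoming cmsd traffic.
-A INPUT -p tcp --dport 1213 -j ACCEPT
# Outgoing TCP 1213 to xrootd.unl.edu and outgoing UDP 3333/3334
# must also be permitted if outbound traffic is filtered.
```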