AFFAIR manual for DATE users

1 Introduction

The AFFAIR[1] package monitors the DATE software as well as the system behavior of the computer cluster where DATE is running. It is installed separately from DATE and can be downloaded from the AFFAIR Web site http://www.cern.ch/affair, with the full code and detailed installation instructions.

Figure 1 gives an overview of the AFFAIR system. It extracts performance data from the monitored nodes and forwards it in fixed time intervals to the monitoring station, where the data is stored and analysed. Summary plots accessible via a Web server or the AFFAIR Control are also created on the monitoring station. All communications and data transfer use the DIM/SMI client/server software of DATE.

The main AFFAIR processes running on the monitored nodes are:

DATE Collector: it collects the performance data from processes running on the LDCs and GDCs. An instance of this process is started for each LDC and GDC node. The DATE Collector maps to the DATE shared memory, accesses the data (see table II) and transfers the extracted values to the AFFAIR Monitor.
System Collector: it collects the system performance from nodes where DATE processes are running. The System Collector parses the files in the /proc directory or runs system calls, and transfers the extracted values to the AFFAIR Monitor.

The main AFFAIR processes running on the monitoring station are:

AFFAIR Monitor: it collects monitored data from the Collectors and places them into temporary storage databases as well as into permanent ROOT storage.
Data Processor: it analyses the stored data and produces, in fixed time intervals, the updated plots for the Web interface.
AFFAIR Control: a graphical user interface which is able to configure, start, stop and control AFFAIR on the entire cluster, and to display performance graphs.
Web interface: it provides links to various performance plots and data.

The DIM package is used for the communication between the nodes:

the DATE DIM name server controls the communication between the AFFAIR Monitor and nodes for DATE processes (DATE Collector).
the AFFAIR DIM name server controls the communication between the AFFAIR Monitor and the nodes where the System Collector is running. It can be different from the DATE DIM name server, so that system performance monitoring can be continued, even when all DATE processes are shut down.

Figure 1: Overview of the AFFAIR structure

2 Installation of the monitoring station

The monitoring station is the host where data collection, storage and analysis of performance data take place and where the Web interface is located. The following packages must be installed on the monitoring station:

ROOT[2].
DIM [3] and SMI [3].
the Apache Web server with PHP support and source code [3], [4].

The steps involved in the installation are the following:

creation of the Apache Web server with PHP support
declaration of environment variables
installation of the actual AFFAIR code
linking of the Web server to the AFFAIR web interface

The AFFAIR user should also start the DIM name server on both the DIM_DNS_NODE computer (the standard DATE DIM name server) and the AFFAIR_DNS_NODE computer (the AFFAIR specific DIM name server). Note that, for simplicity, these two name servers can be the same.

1 Apache Web server with PHP support installation

The system configuration on the monitoring station may already have a combination of Apache/PHP running. To enable AFFAIR installation without interfering with existing web server setups, several web server installation scenarios are considered:

no Apache Web server running.
Apache Web server running with PHP and no PHP source code.
Apache Web server running with PHP and PHP source code.

In all cases the path/name of the PHP configuration file has to be declared via the environment variable PHPCONFIG, which is required by the AFFAIR Web interface. In systems where PHP code comes with a preinstalled Apache Web server, this file is usually /usr/local/bin/php-config. Consider these cases in more detail:

1 No apache web server

If the monitoring station has no Web server, the simplest procedure is to download the Apache and PHP code from the AFFAIR site, and then run, as root, the Web installation script in $AFFAIR_MONITOR. For example,

source installWebPhp.sh apache_1.3.29 php-4.3.2

installs the Web server with PHP support in the /usr/local/apache directory (the default Apache directory) and configures the Apache configuration file httpd.conf to be able to run PHP scripts. It also adds some internet security features to httpd.conf. The default path of the php-config is /usr/local/bin/php-config.

2 Apache with PHP, but no PHP source code

Often the PHP source code is not included in a distribution with a functioning web server with PHP support. The AFFAIR site has the relevant code, which can be downloaded into any directory (e.g. /local/phptmp) and installed (e.g. in /local/php) by:

cd /local/phptmp
tar xvzf php-4.3.2.tar.gz
cd php-4.3.2
./configure --prefix=/local/php --enable-track-vars --disable-mysql --without-mysql
make
make install (as root)

The PHPCONFIG environment variable for this example is /local/php/bin/php-config.

3 Apache with PHP, and PHP source code

Nothing needs to be done.

4 Starting Apache

The Apache daemon is started by:

/usr/local/apache/bin/apachectl start

It is recommended to start Apache at boot time by adding the following line to the file /etc/rc.d/rc.local:

[ -x /usr/local/apache/bin/apachectl ] && /usr/local/apache/bin/apachectl start

2 Environment configuration

The directory where the AFFAIR code on the monitoring station will be downloaded and installed needs to be declared as AFFAIR_MONITOR. In addition, all the relevant paths and environment variables for DIM and SMI (both the standard DATE and AFFAIR specific name server), ROOT, and PHP need to be declared. The best method is to add them to the ~/.bashrc file. A typical example is:

# affair specific environment variables
export AFFAIR_MONITOR=/local/affair
export AFFAIR_DNS_NODE=lxs5013.cern.ch
# dim/smi specific environment variables
export DIM_DNS_NODE=lxs5013.cern.ch
export DIMDIR=/local/dim-for-affair
export DIMBIN=/local/dim-for-affair/linux
export SMIDIR=/local/smi-for-affair
export SMIBIN=/local/smi-for-affair/linux
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$DIMBIN:$SMIBIN
#directory where php configuration is contained
export PHPCONFIG=/local/php/bin/php-config

As ROOT also has to be accessible from the Apache Web server, its environment variables have to be in a common location such as /etc/profile, and not in a user specific one. Thus /etc/profile should contain:

#root specific environment variables
export ROOTSYS=/local/root-for-affair
export PATH=$PATH:$ROOTSYS/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ROOTSYS/lib
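
Before compiling, it can be useful to confirm that the variables declared above are actually visible to a process. The following Python sketch is not part of the AFFAIR distribution: the helper name missing_affair_variables is hypothetical, and the list of variables is simply collected from the two examples above.

```python
import os

# Variables named in the two example blocks above (user profile and /etc/profile).
REQUIRED_VARIABLES = (
    "AFFAIR_MONITOR", "AFFAIR_DNS_NODE", "DIM_DNS_NODE",
    "DIMDIR", "DIMBIN", "SMIDIR", "SMIBIN", "PHPCONFIG", "ROOTSYS",
)


def missing_affair_variables(environment=None):
    """Return the required variables that are unset or empty (hypothetical helper)."""
    if environment is None:
        environment = os.environ
    return [name for name in REQUIRED_VARIABLES if not environment.get(name)]


if __name__ == "__main__":
    missing = missing_affair_variables()
    if missing:
        print("Not set: " + ", ".join(missing))
    else:
        print("All AFFAIR-related variables are set.")
```

Running the check both from a login shell and from the account used by the Web server shows whether the ROOT variables are indeed available system-wide.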

3 AFFAIR installation

The affairMonitorVERSION.tar.gz file (e.g. affairMonitor2.6.tar.gz) needs to be downloaded to the $AFFAIR_MONITOR directory from the AFFAIR site http://www.cern.ch/affair. At this point AFFAIR can be installed as follows:

cd $AFFAIR_MONITOR
tar xvzf affairMonitorVERSION.tar.gz
make

This compiles the full AFFAIR code for the monitoring station. At this stage the Web interface can be enabled. The Web server should be linked (as root) to the appropriate AFFAIR code:

ln -s $AFFAIR_MONITOR/web /usr/local/apache/htdocs/affair (if /usr/local/apache is the directory where Apache is installed)

The AFFAIR Web is accessed with:

http://mymachine.mydomain/affair/index.php (e.g.: http://pcald30.cern.ch/affair/index.php)

3 Installation of the nodes

The affairDateVERSION.tar.gz file available on the AFFAIR Web page needs to be downloaded on every node where DATE is running, and placed in the $AFFAIR directory (a usual DATE setup already defines AFFAIR as /date/affairRC). If the $AFFAIR directory is on a shared file system, this needs to be done only once.

The AFFAIR_DNS_NODE environment variable has to be declared in addition to the standard DATE environment variables.

The file $DATE_SITE_CONFIG/AFFAIR.config should be changed or created to have the content shown in table I, adjusted to the actual setup:

AFFAIR_DNS_NODE pcdaq01

DIMBIN $DIMDIR/linux

SMIBIN $SMIDIR/linux

LD_LIBRARY_PATH $LD_LIBRARY_PATH:$DIMBIN:$SMIBIN

Table I: The file $DATE_SITE_CONFIG/AFFAIR.config

After this, the code can be compiled:

cd $AFFAIR
tar xvzf affairDate2.6.tar.gz
make

(in this example the AFFAIR version is 2.6).

The AFFAIR Collectors running on the LDCs and GDCs are started by the run control via the script /date/runControl/AFDC.sh

By default, this script starts only a dummy process, so it must be changed to actually start the AFFAIR Collectors. The script shown below is suggested, but it is up to the user to write his/her own.

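#!/bin/csh -f
# Use a placeholder role if DATE_ROLE is not defined
if ( "${?DATE_ROLE}" == "0" ) setenv DATE_ROLE "UNKNOWN"
# One log file per DATE role
setenv L "/tmp/AFFAIR.${DATE_ROLE}.log"
chmod 0777 "$L" "$L.*"
# Rotate the previous logs, keeping five old copies
mv -f "$L.4" "$L.5"
mv -f "$L.3" "$L.4"
mv -f "$L.2" "$L.3"
mv -f "$L.1" "$L.2"
mv -f "$L" "$L.1"
if (${DATE_ROLE} == "EDM" || ${DATE_ROLE} == "DDG") then
# For these roles only the dummy process is started
${DATE_RC_BIN}/dummy_collector $1
else
# Start the Collectors, sending stdout and stderr to the log file
( $AFFAIR/startAffair.sh $1 ) >& "$L"
endif
chmod 0777 "$L" "$L.*"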
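
The log handling in the script above follows a simple rotation scheme: each old copy is shifted by one index and the current log becomes copy number 1. As an illustration only (the helper rotate_log is hypothetical and not part of AFFAIR or DATE), the same logic can be sketched in Python, skipping files that do not exist yet:

```python
import os
import tempfile


def rotate_log(log_path, keep=5):
    """Shift log_path.4 -> log_path.5, ..., log_path -> log_path.1.

    Missing files are skipped; an existing oldest copy is overwritten.
    """
    for index in range(keep - 1, 0, -1):
        source = f"{log_path}.{index}"
        if os.path.exists(source):
            os.replace(source, f"{log_path}.{index + 1}")
    if os.path.exists(log_path):
        os.replace(log_path, f"{log_path}.1")


if __name__ == "__main__":
    # Demonstration in a temporary directory with a made-up log name.
    with tempfile.TemporaryDirectory() as directory:
        log = os.path.join(directory, "AFFAIR.LDC.log")
        for run in range(3):
            rotate_log(log)
            with open(log, "w") as handle:
                handle.write(f"run {run}\n")
        print(sorted(os.listdir(directory)))
```

The rotation runs from the oldest copy to the newest so that no file is overwritten before it has been moved.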

4 Configuration and operation

AFFAIR processes need to be started on all the monitored nodes as well as the monitoring station. The monitored nodes are enabled by the DATE runControl, where the setting of the checkbutton marked AFFAIR starts the monitoring on all the LDC and GDC nodes.

The easiest way to configure and operate AFFAIR is to use on the monitoring station the affairControl script:

cd $AFFAIR_MONITOR
./affairControl.sh

This script starts a GUI which allows easy setup and configuration of the monitored nodes, as well as viewing of all performance plots accessible via the Web interface.

Figure 2a shows the tab in the AFFAIR Control used to modify the list of monitored nodes. The full hostname must be specified (for example, it should be pcdaq01.cern.ch and not pcdaq01).

Pressing the "Start monitoring for nodes with DATE installed" button in the Monitor control tab (figure 2b) starts performance data collection and storing. Afterwards the "performance plots" tab provides access to graphs of the analysed data (see figure 3).

The default behavior of the AFFAIR Monitor is to write to both temporary online databases and permanent ROOT offline storage, as well as to create performance plots accessed via the Web interface. The offline storage can be turned off by clicking on the appropriate radio button in this tab. It is also possible to disable the plot creation. This can be useful if only AFFAIR Control is used to inspect the performance.

Figure 2: AFFAIR Control: a) Tab for setting up DATE parameters and b) Tab for starting and controlling AFFAIR monitoring

Figure 3: AFFAIR Control tab giving access to all the performance plots

1 Automating the monitoring station

Cron jobs are a good way to ensure that the monitoring station processes start at boot time and that they smoothly recover from possible crashes. For example, if /local/affair is the AFFAIR_MONITOR directory, then the following cron job starts monitoring the nodes with offline storage (entered using the crontab -e shell command):

*/1 * * * * source ~/.bash_profile; source ~/.bashrc; $AFFAIR_MONITOR/startProcess.sh monitor >&/dev/null &

To disable offline storage, the monitor_nopermstore parameter is used instead of monitor. The startProcess.sh script starts the AFFAIR Monitor and several instances of the Data Processor (see section V.F) with the appropriate flags.

5 Detailed description

In this section a more detailed description of AFFAIR components is presented.

1 Monitored values

The performance data is grouped into three types, named monitoring sets, depending on which Collector is used:

GDC monitoring set: data collected by the DATE Collectors on GDC machines
LDC monitoring set: data collected by the DATE Collectors on LDC machines
DATEMON monitoring set: data collected by the System Collectors on all machines

The performance data is separately analysed and plotted for each monitoring set. Tables II and III show the monitored values for the different monitoring sets. The definition of the partitions 1 to 3, of the disks 1 and 2, and of the network card is done via configuration file (see section VI.C).

ID        Description
0         run number
1         total MB recorded
2         total MB injected
3         run status
4         number of events recorded
5         event count
6         data recorded rate in MB/sec
7         data injected rate in MB/sec
8 to 31   DDL id (only for LDC.conf)
32 to 55  DDL data rate (only for LDC.conf)
56 to 79  DDL fragment number rate (only for LDC.conf)

Table II: LDC and GDC monitored values

ID  Description
0   percent of partition 1 filled
1   percent of partition 2 filled
2   percent of partition 3 filled
3   amount of disk free of partition 1 in MBytes
4   amount of disk free of partition 2 in MBytes
5   amount of disk free of partition 3 in MBytes
6   blocks/sec read from disk 1
7   blocks/sec read from disk 2
8   blocks/sec written to disk 1
9   blocks/sec written to disk 2
10  User CPU as percentage of total CPU
11  Nice CPU as percentage of total CPU
12  System CPU as percentage of total CPU
13  Idle CPU as percentage of total CPU
14  bandwidth into network card in MB/sec
15  bandwidth out of network card in MB/sec
16  amount of swap free in MBytes
17  amount of RAM free in MBytes
18  number of sessions logged in
19  pages swapped in in MBytes/sec
20  pages swapped out in MBytes/sec

Table III: System monitoring

2 Web interface

The function of the AFFAIR Web interface is to provide links to plots as well as to performance data. The eps format of the plots created by the Data Processor (see section V.F) is not suitable for Web access, so the Web interface first converts the requested plots to a png format, using a system call to the $AFFAIR_MONITOR/support/convert function.

A screenshot of the AFFAIR Web interface top page is shown in Figure 4. The links on the left provide access to summary plots (see section V.C) as well as current performance data in a tabular text format. The links on the right generate detailed plots for each monitored node.

There are no histograms in the Web interface, as the information for them cannot be extracted from the temporary databases (so-called round robin databases, or RRDs; see section V.G).

On each page there is a choice of the available plot time intervals, starting from the previous 20 minutes and going up to the last month. The refresh time of the page can be chosen from the provided time intervals (between 30 seconds and infinity), see Figure 4.

Figure 4: Top page of the web interface

3 Plots

The graphs provided by the web interface are of the following types:

Global plots. These are summary plots which show performance as a function of time for a number of monitored nodes on one plot:
- superimposed: the performance data for up to 20 monitored nodes is shown superimposed on one plot, enabling a global overview of their behavior, see Figure 6. If the number of monitored nodes is more than 20, additional plots are created for readability.
- summed: the aggregate value is shown. This is very useful to measure, for example, the total GDC or network throughput as a function of time, see Figure 5.
Individual plots: they show performance as a function of time for a specific monitored node. They can show individual metrics, or several metrics together on one plot. This summed option is useful to observe, for example, the user/system/nice/idle CPU values all together on one plot, see Figure 7.
Snapshot: These plots give a snapshot of all monitored nodes at the current time and are displayed in the top page of the Web interface (see Figure 4). They provide a clear overview of the entire cluster status. Examples are given in Figures 8 and 9.
Status: The top page also provides the on/off status of all the monitored nodes for the selected time period, see Figure 10.

In addition to the average values, the maxima for a particular bin are drawn as superimposed dashed lines.

Figure 5: Aggregate in and out transfer rate for the past seven days

Figure 6: System CPU status for the past day for 20 nodes

Figure 7: CPU status for the past 20 minutes for one node

Figure 8: Current CPU status of all nodes. The bottom is color coded to match the node links on the right side of the AFFAIR Web pages

Figure 9: Current in and out transfer rate of all nodes

Figure 10: Status of node during the past month

4 Collectors

The Collectors are the processes which gather the performance data on each monitored node. They execute an endless loop, with the period provided by the AFFAIR Monitor. At the end of each loop the data is extracted and sent to the AFFAIR Monitor. The data is obtained in several ways:

DATE Collector gets all its data from the DATE shared memory
System Collector either parses the /proc directory, or executes system calls.
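
To illustrate the kind of /proc parsing involved, the Python sketch below derives the four CPU percentages of table III from two samples of the aggregate "cpu" line in the Linux /proc/stat layout. This is not the AFFAIR Collector code: the function names are hypothetical, the choice of /proc/stat is an assumption, and only the first four counters (user, nice, system, idle) are used, although recent kernels report additional ones.

```python
def read_cpu_counters(stat_text):
    """Return (user, nice, system, idle) counters from the aggregate 'cpu' line."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            user, nice, system, idle = (int(value) for value in fields[1:5])
            return user, nice, system, idle
    raise ValueError("no aggregate 'cpu' line found")


def cpu_percentages(before, after):
    """Percentage of time spent in each state between two counter samples."""
    names = ("user", "nice", "system", "idle")
    deltas = [new - old for old, new in zip(before, after)]
    total = sum(deltas)
    if total <= 0:
        return {name: 0.0 for name in names}
    return {name: 100.0 * delta / total for name, delta in zip(names, deltas)}


if __name__ == "__main__":
    # Two made-up samples in the /proc/stat layout, taken one period apart.
    first = "cpu  1000 50 300 8650 0 0 0\ncpu0 1000 50 300 8650 0 0 0\n"
    second = "cpu  1060 50 320 9570 0 0 0\ncpu0 1060 50 320 9570 0 0 0\n"
    print(cpu_percentages(read_cpu_counters(first), read_cpu_counters(second)))
```

The counters are cumulative, so a percentage is only meaningful for the difference between two samples, which matches the periodic loop described above.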

5 AFFAIR Monitor

The AFFAIR Monitor is the AFFAIR component on the monitoring station that gets performance data via DIM communication channels and stores it in temporary round robin databases (RRD) as well as in ROOT permanent storage.

When the AFFAIR Monitor is started, it reads the file $AFFAIR_MONITOR/config/computerlist.conf to get a list of all the nodes used, and subscribes to the LDC, GDC and DATEMON services on all the nodes. It then sends configuration parameters to the nodes (see section VI.C), and receives back the final parameters used (if the Collector changed them), as well as some system information, like the kernel version, CPU speed or RAM size. Several configuration and log files are updated with this data.

The AFFAIR Monitor then continuously receives performance data, with a period specified in the parameters sent to the Collectors. The data is sent both to the appropriate RRD and, if offline storage is chosen, to ROOT storage files. If an RRD or ROOT file does not yet exist for a particular monitored node and monitoring set, it is created on the fly.

6 Data Processor

The Data Processor is ROOT based code to create plots for the Web interface and to perform detailed offline analysis of the monitored data. The default behavior of the monitoring station, using the startProcess.sh script, is to create several Data Processor instances (one for each monitored set).

The Data Processors loop continuously over all the RRDs, generate global plots and place them in the $AFFAIR_MONITOR/plots directory. The plots are in eps format since this is the ROOT output format.

The latency between cycles depends on the number of nodes, and is usually under 1 minute. A number of flags regulate the behavior of the Data Processors, in the format -flag=value, as described in table IV.

Flag  Description
m     monitorSet. Must be one of the following: GDC, LDC, DATEMON
r     run option. Must be one of the following:
      ANA_GLOBAL (offline analysis, creates global plots using a file list),
      ANA_EACH (offline analysis, creates plots for individual nodes using a file list),
      RRD_GLOBAL (continuously creates global plots, using RRDs),
      RRD_EACH (continuously creates plots for an individual computer, using RRDs),
      RRD_ONE_COMPUTER (creates plots once for an individual computer, using RRDs. This is called by the Web server)
x     number of bins in the x direction for the graphs. Default is 120
p     period option (separate process for last hour, 20min, day, hourday, month, year, all). This is only relevant with the RRD run options. Default is all
s     sum or superimpose option for global plot creation (SUPERIMPOSE, SUM_AND_SUPERIMPOSE, SUM). Default is SUM_AND_SUPERIMPOSE. Only relevant with a GLOBAL run option
b     beginDate (format is Day:Month:Year/Hour:Min:Second). Used with -e. Only relevant with an ANA run option
e     endDate (same format as beginDate)
i     filename with the list of ROOT files. Default is MONITOR/config/rootfileList.conf. Only used with an ANA run option
c     filename with the list of computers. Default is to use the RRDs in the /rrd directory. With the RRD_ONE_COMPUTER run option it is the actual computer name.
v     index of the variable of a particular monitoring set (selected with the -m flag). Used, for example, to analyse a particular run. This is only used with an ANA run option, and requires the -l and -h flags to be set to define the desired range. The time period defined with the -b and -e flags should be larger than this range.
l     lower value of the variable defined with the -v flag
h     higher value of the variable defined with the -v flag

Table IV: Parameters for the Data Processor
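
The beginDate and endDate values have to follow the Day:Month:Year/Hour:Min:Second layout. A small Python sketch for producing such strings is given here; the helper name format_processor_date is hypothetical, and writing day and month without leading zeros is an assumption based on values such as 5:5:2005/20:00:00.

```python
from datetime import datetime


def format_processor_date(moment):
    """Format a datetime as Day:Month:Year/Hour:Min:Second for the -b and -e flags."""
    return f"{moment.day}:{moment.month}:{moment.year}/{moment:%H:%M:%S}"


if __name__ == "__main__":
    begin = datetime(2005, 5, 5, 20, 0, 0)
    end = datetime(2005, 5, 8, 23, 30, 0)
    print(format_processor_date(begin), format_processor_date(end))
```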

7 Round Robin Databases

Round Robin Databases[5] (RRDs) are a very efficient temporary storage mechanism that enables the incoming data to be stored with minimal delay. RRDs work with a fixed amount of data, corresponding to a fixed time depth. The data is structured in rows, with one row for a particular time. The number of columns is the number of parameters monitored, each stored as a real number. For every node, each monitoring set has its own RRD, which contains the information shown in table V. The data is always averaged for the given time resolution, so only a small amount of data has to be read out, which makes plot creation efficient. The maxima for each row can also be recorded by the RRDs. This feature is used in plotting the maxima as dashed lines, superimposed over the mean values. Of course, for the highest resolution (10 sec in our case) the maxima and mean are the same numbers. As the data is averaged for a particular time resolution, the RRDs cannot be used to create histograms. These, however, can be made offline using the permanent data storage. The time resolution can be changed on an existing RRD file by using the rrdtool tune function (for details see the description of the rrdtool function on the following Web page: http://people.ee.ethz.ch/~oetiker/webtools/rrdtool/doc/rrdtool.en.html).

time resolution  total time depth
10 sec           1 hour
60 sec           6 hours
4 min            24 hours
28 min           7 days
2 hours          1 month
6 hours          3 months
1 day            1 year

Table V: Relation between the time resolution and the total time depth
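
Since each RRD holds a fixed amount of data, the number of rows per archive follows from dividing the total time depth by the time resolution. The Python sketch below does this arithmetic for the values of table V. It is an illustration only, and it assumes that "1 month" means 30 days, "3 months" 90 days and "1 year" 365 days.

```python
SECONDS = {"sec": 1, "min": 60, "hour": 3600, "day": 86400}

# (resolution, depth) pairs from table V, each written as (count, unit).
# Assumption: 1 month = 30 days, 3 months = 90 days, 1 year = 365 days.
ARCHIVES = [
    ((10, "sec"), (1, "hour")),
    ((60, "sec"), (6, "hour")),
    ((4, "min"), (24, "hour")),
    ((28, "min"), (7, "day")),
    ((2, "hour"), (30, "day")),
    ((6, "hour"), (90, "day")),
    ((1, "day"), (365, "day")),
]


def rows_per_archive(archives=ARCHIVES):
    """Number of rows needed per archive: total depth divided by resolution."""
    rows = []
    for (step_count, step_unit), (depth_count, depth_unit) in archives:
        step = step_count * SECONDS[step_unit]
        depth = depth_count * SECONDS[depth_unit]
        rows.append(depth // step)
    return rows


if __name__ == "__main__":
    print(rows_per_archive())  # [360, 360, 360, 360, 360, 360, 365]
```

Under these assumptions every archive needs about 360 rows, so the amount of data read for a plot is similar whatever time interval is displayed.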

8 Directory structure

The AFFAIR code is contained in the $AFFAIR_MONITOR directory and distributed in a directory structure as shown in table VI.

config    configuration files

Table VI: AFFAIR directory structure

6 Configuration files

There are several configuration files regulating what is being monitored and what is being plotted. They are all placed in the $AFFAIR_MONITOR/config directory and are described in this section.

1 computerlist.conf

This configuration file contains the full list of computers to be monitored in a DATE environment, with one computer name per line. Any new node that will be monitored in a DATE environment should be included in this file. Note that if a node is removed by the runControl, there is no need to remove it from this configuration file. Also, it is important to write the FULL node name, such as pcdaq01.cern.ch (i.e. with cern.ch), as this ensures consistency in DIM service declarations between the Collectors and the AFFAIR Monitor.
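
Because short names lead to inconsistent DIM service declarations, it can help to check the file before starting the AFFAIR Monitor. The following Python sketch is not part of AFFAIR: the helper short_node_names is hypothetical, and treating a name without a dot as "not fully qualified" is a simplifying assumption.

```python
def short_node_names(computerlist_text):
    """Return entries that do not look fully qualified (no dot in the name)."""
    names = [line.strip() for line in computerlist_text.splitlines()]
    return [name for name in names if name and "." not in name]


if __name__ == "__main__":
    # Made-up file content with one computer name per line.
    example = "pcdaq01.cern.ch\npcdaq02\npcdaq03.cern.ch\n"
    print(short_node_names(example))  # ['pcdaq02']
```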

2 GDC.conf, LDC.conf, DATEMON.conf

These files define which plots are created for the Web server, and how they look (units, scaling, labels, etc.). The defaults given with the AFFAIR distribution can be modified by editing these files. Every monitored variable has a line dedicated to it. The order within the files is the same as the order of the monitored data sent from the nodes. Each of the lines has the parameters presented in table VII. An example is shown in figure 11.

Variable parameter Description

ID The index of the variable. In the first line it is set to 0 (and thus is the first variable monitored), and sequentially goes up to the last variable.

Name The name of the variable. Every variable must have a unique name

scaling constant Used to scale the data. For example, 0.0009728 will scale measured data from KB to MB

Type The data source type used for creating the Round Robin Databases. The value GAUGE means that the value itself is stored, and not the differences from previous values

Global plot option Indicates how the global plots should be created:

-1: the global plots are created only in the superimposed way. An example is Figure 6

-2: the global plots are not created at all

0 or higher: creates a graph of the sum of this variable over all computers. To get different graphs displayed on different plots, set this parameter for each variable to its index (ID). To group several graphs on the same plot, set this parameter for all the variables of the group to the lowest variable index (ID) in the group. An example of this is Figure 5

Snapshot plot option Indicates how snapshot plots of the current time should be created:

-1: bar plot with only this monitored variable

-2: not created

0 or higher: creates a graph of the sum of this monitored variable and the consecutive variables having this ID. It is the ID of the first variable which should be summed. Examples are Figure 8 and Figure 9

Individual plot option Indicates how the plots for an individual computer should be created:

-1: graph with only this monitored variable

-2: not created

0 or higher: creates a graph of the sum of this monitored variable and the consecutive variables having this ID. It is the ID of the first variable which should be summed. An example is Figure 7

Shared title Title used by plots when the individual, snapshot or global plot option is >= 0. It consists of two parts, separated by an underscore. The first part is the title of the set of variables, while the second is the label of each variable. For example: CPU_User, CPU_Nice. If the second part is AUTO, then the appropriate label from computers.conf is used

Y axis title Label shown on the y axis

index in computers.conf Used to display the actual parameter (e.g. 0 for ethernet card 0, /hda2,...) on the plot. The value is the actual index of the variable in computers.conf, starting from the first parameter (i.e. the 5th position, which is defined as index 0, see table IX)

plot type Groups the plots in the Web interface. This way only a few related plots (such as CPU, IO, ...) are shown at one time

Global plot option - multiple variables Makes it possible to have superimposed graphs of different monitored variables. This is used to display on one plot the DDL throughput for all the used DDLs (defined by the last parameter in this configuration file) for up to 20 different monitored nodes. It is the ID of the first variable which should be superimposed.

ID of reference variable Plots the graphs only if the value of the referenced monitored variable is >= 0. For example, this is used to display only active DDLs (a DDL is active if its idDdl, monitored variables 8 to 31, is >= 0) out of a maximum of 32. So if only one DDL is active, only one graph is shown instead of 32

Table VII: Parameters for the GDC.conf, LDC.conf, and DATEMON.conf files
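
The grouping rule of the global plot option (variables sharing the same value >= 0 end up on the same plot, keyed by the lowest ID of the group) can be illustrated with a short Python sketch. This is not AFFAIR code: the function group_global_plots and the variable names in the example are hypothetical, and only the three cases described in table VII are handled.

```python
from collections import defaultdict


def group_global_plots(variables):
    """Group variable names by their global plot option.

    variables: iterable of (variable_id, name, global_option) tuples.
    Returns (summed, superimposed_only): summed maps the plot key (the lowest
    ID of the group) to the names drawn on that plot; superimposed_only lists
    the names with option -1. Variables with option -2 get no global plot.
    """
    summed = defaultdict(list)
    superimposed_only = []
    for variable_id, name, option in variables:
        if option >= 0:
            summed[option].append(name)
        elif option == -1:
            superimposed_only.append(name)
    return dict(summed), superimposed_only


if __name__ == "__main__":
    # Made-up entries: two network variables grouped under ID 14.
    example = [
        (14, "netIn", 14),
        (15, "netOut", 14),
        (16, "swapFree", -1),
        (18, "sessions", -2),
    ]
    print(group_global_plots(example))
```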

Figure 11:The DATEMON.conf configuration file

3 monsetDefault.conf and computers.conf

The monsetDefault.conf file provides the default parameters sent by the AFFAIR Monitor to the Collectors, with a line for each monitoring set. If a parameter is set to AUTO, then the Collector will determine its value. For example, if the parameter for the ethernet card is set to AUTO, then the Collector will use the card with the largest traffic.

The monsetDefault.conf file is also used to control which monitoring set is displayed on the Web and the text that accompanies it. A hash mark on the first space of each line causes the web to ignore this monitoring set. The format in the monsetDefault.conf file is shown in table VIII.

monitoring set (if not used, put a # mark at beginning) text displayed on web (underscore becomes blank space) (currently not used) monitoring time period (currently not used) sequence of one or more parameters sent to monitored nodes

Table VIII: Format of the monsetDefault.conf configuration file. In the case of DATEMON, the eight parameters sent correspond to the time period, an unused parameter, partition names 1, 2 and 3, disk names 1 and 2, and the ethernet device (see table III).

The final parameters returned from the nodes and used in subsequent AFFAIR runs are stored in the computers.conf file, with a separate entry for each monitored node and monitoring set. The format of the computers.conf file is shown in table IX. The parameters can be changed (for example, to switch to a less used ethernet card, say from 0 to 1) and they will be used the next time either the AFFAIR Monitor or the appropriate Collector starts.

computer name monitor set time interval (currently not used, set to AUTO) variable specific parameter

Table IX: Format of the computers.conf configuration file

7 Offline storage and analysis

The ROOT file is closed and transferred to the $AFFAIR_MONITOR/backup directory after about ten hours of data taking. It is then renamed to a filename indicating the start and end of data taking, as well as the monitored node name and the monitoring set, with the following format:

NodeName_MonitorSet_HH_mm_DD_MMM_YY___HH_mm_DD_MMM_YY.root

For example, if a GDC ROOT file for the monitored node pcdaq01.cern.ch was created at 23:52 on the 5th August 2005 and closed at 9:35 on the 6th August, it is named pcdaq01.cern.ch_GDC_23_52_05_Aug_05__09_35_06_Aug_05.root
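
When many backup files accumulate, the naming scheme makes it possible to select files by node, monitoring set or time without opening them. The Python sketch below parses such a file name. It is an illustration, not AFFAIR code: the helper parse_backup_name is hypothetical, and it assumes that the two timestamps are separated by two or more underscores and that month names are the English three-letter abbreviations.

```python
import re
from datetime import datetime

# Assumption: timestamps look like HH_mm_DD_MMM_YY and are separated by
# two or more underscores; the monitoring set is GDC, LDC or DATEMON.
_STAMP = r"\d\d_\d\d_\d\d_[A-Za-z]{3}_\d\d"
_PATTERN = re.compile(
    r"^(?P<node>.+)_(?P<set>GDC|LDC|DATEMON)_"
    r"(?P<start>" + _STAMP + r")_{2,}(?P<end>" + _STAMP + r")\.root$"
)


def parse_backup_name(filename):
    """Split a backup ROOT file name into node, monitoring set, start and end."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename}")
    layout = "%H_%M_%d_%b_%y"
    return {
        "node": match["node"],
        "set": match["set"],
        "start": datetime.strptime(match["start"], layout),
        "end": datetime.strptime(match["end"], layout),
    }


if __name__ == "__main__":
    name = "pcdaq01.cern.ch_GDC_23_52_05_Aug_05__09_35_06_Aug_05.root"
    print(parse_backup_name(name))
```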

1 ROOT file interface for the offline analysis

The types of plots accessed online by the Data Processor for a fixed time depth can be created offline for any desired time period by using the generated ROOT files. Table IV explains the different options available. For example, to have an analysis of LDCs for the period between 20:00 on the 5th May and 23:30 on the 8th May:

$AFFAIR_MONITOR/src/root/AffairDataProcess -m LDC -r ANA_GLOBAL -b 5:5:2005/20:00:00 -e 8:5:2005/23:30:00

The assumption is that all the ROOT files are stored or linked in $AFFAIR_MONITOR/backup. The above example creates performance graphs and places them in the $AFFAIR_MONITOR/plots/user directory. Unlike the Web based graphs, histograms are also produced. A ROOT file called user.root, containing all the plots and histograms, is also placed in the $AFFAIR_MONITOR/plots/user directory. Global plots are in the top ROOT directory, while individual plots are in the appropriate sub-directory. This file is viewed by typing on the console:

root // to start ROOT
TFile *f = new TFile("user.root")
TBrowser b

At this point a browser with ROOT files is shown, and by clicking on "ROOT files", followed by "user.root" and "LDC", the plots are accessed.

2 Web interface to offline analysis

Offline analysis can also be performed using the Web interface, accessed by the Offline analysis button (figure 12).

Figure 12:Offline analysis interface

For security reasons password protection needs to be set up. The procedure consists of several steps. First, in the Apache configuration file httpd.conf (usually in /usr/local/apache/conf) the line AllowOverride None needs to be changed to AllowOverride All. Next, three files need to be created in $AFFAIR_MONITOR/web/offline:

.htaccess
.htgroup
.htpasswd

The .htaccess file should look like this:

AuthUserFile /local/affair/web/offline/.htpasswd
AuthGroupFile /local/affair/web/offline/.htgroup
AuthName affair_analysis
AuthType Basic
<Limit GET>
require group allowed
</Limit>

It is necessary to have the full path for the AuthUserFile and AuthGroupFile parameters. The .htgroup file contains the names of the users. For example, if they are Tom, Dick, and Harry, it would be:

allowed: Tom Dick Harry

The .htpasswd file is created using the htpasswd command. For our example, the procedure would be:

htpasswd -c $AFFAIR_MONITOR/web/offline/.htpasswd Tom
htpasswd $AFFAIR_MONITOR/web/offline/.htpasswd Dick
htpasswd $AFFAIR_MONITOR/web/offline/.htpasswd Harry

The -c flag is applied only for the first user to create the .htpasswd file.

8 Logging

The default outputs of the AFFAIR Monitor, Data Processor and Collector are sent to stdout. In the cron job example given above, all the output is sent to /dev/null. It can instead be redirected to any logfile, as the following example shows for the Data Processor:

source $AFFAIR_MONITOR/startProcess.sh startDataProcessor DATEMON $AFFAIR_MONITOR/log/dataProcessor.log

A much more verbose output is achieved by adding the verbose option as the last parameter of any $AFFAIR_MONITOR/startProcess.sh call.

9 Upgrade

AFFAIR may be upgraded independently of DATE. Unpacking the file affairMonitorNEWVERSION.tar.gz in the $AFFAIR_MONITOR directory and running make does not overwrite existing configuration files and databases.

Bibliography

1 http://www.cern.ch/affair, the AFFAIR web site
2 http://root.cern.ch, the ROOT web site
3 http://httpd.apache.org, the Apache web site
4 http://www.php.net, the PHP web site
5 http://people.ee.ethz.ch/~oetiker/webtools/rrdtool/, the rrdtool web site