MONARC meeting 26-7-99

Participants

In the Room:
H. Newman, L. Perini, I. Legrand, L. Luminari, U. Gasparini, E. Arderiu, Y. Morita I. Williers, D. Williams, P. Galvez, S. O'Neale, S. Resconi, H. Stockinger, D. Ugolotti, M. Sagaravatto, F. Harris, P. Capiluppi, A. Nazarenko, C. Vistoli, A. Brunengo, L. Robertson, F. Gagliardi, T. Cass, T. Smith, T. Hakulinen
By Videoconference:
L. Barone, K. Sliwa, I. Gaines, N.McCubbin, M. Dameri, J. Gordon

Introduction -- H.Newman, ( ppt)

Some relevant points:

Importance of Marseille meeting and of contact with experiments before it, as clear from the Milestones as set by the referees.
Size of 100kB for ESD to be discussed
Simulation team consiberably increased, coordination issues between CERN, Italy, US, Japan to be adressed

MONARC relevant items from RD55 Workshop -- Eva Arderiu

Some relevant points:

The real need for associations in DB's over WAN (we confirm it is still in our model, albeit the quantity could be relatively small, also between RC's)
Input on AMS requirements if we are interested in enhancements

Validation procedure -- K.Sliwa

The idea is to verify the logical model we have in simulation, and parametrize CPU and I/O in various condition, so that we are able in the simulation to reproduce the results.

When we will simulate LHC era performances, the CPU and I/O assumptions will be of course different, but the reproducibility verified in the present conditions will give confidence, and be a first real validation of the simulation.

Y.Morita asks for more details about how to validate, L.Perini answers some ideas will be presented by M.Sgaravatto and suggests to have discussion after his presentation.

Recent results and plans for Obj tests -- Y.Morita

Results on KEK satellite link are presented, also with DRO option; DRO is working on WAN, considearable overhead is present due also to transaction object size (each handshake seem to require at least 4kB transfer..., not enough info. from Objectivity about the protocol which is used); the performance is now too low, and no one seems to be using DRO in production till now, but solution was proposed at the RD45 workshop.

Data model issues, like the size of the objects, the level and frequency of the associations etc., will be the next step of our studies, after the first validation.

Plan for Obj measurements -- M.Sgaravatto, (ppt)

Tests like the ones already done in Milano (and ported in Padova and Genova) on a single machine, will be done on LAN with 2 SUN machines in Padova and CNAF, WAN tests will be performed with the SUN machines of Milano, Genova, Padova, CNAF, CERN, using also QoS dedicated network CERN-CNAF in the 2 first week of August, provided in the framework of TF-TANT; 15 LINUX PC's are available in Bologna and aTLFAST++ is beeing ported on LINUX. The PC's will be used for multiclient tests (L.Perini adds they are also a possible environment for validation tests based mainly on LAN, providing some data to be reached via association on at least one site WAN connected).

I.Legrand ask to have also pure FTP measurements in the same WAN configuration we use for AMS measurements; H.Newman and Y.Morita stress the importance of the packet-loss parameter, besides the Round Trip Time, and other traffic level. L.Luminari underlines the simulation program should provide the results on the test-and-validation configurations at the same time as the tests are performed.

A Strawman LHC Computing Facility at CERN ( for one single experiment) -- L.Robertson

Slides are presented, detailing the components: processors are assembled in clusters, assembled in sub-farms; each cluster come with a suitable I/O capacity, LAN should not be an issue. For disks assume inexpensive RAID arrays, SAN type if this market develop into high-volume, low-cost: disk system is also probably not an issue. Tapes are a big problem: conservative assumptions are 100GB per cartridge, 20MB/sec per drive, with 25% achievable.

The I/O models are reviewed, AMS-based, and also with LAN-SAN routing.

Layout at CERN can require 400kW on 370m**2; the estimations are conservative, reality may be a factor 2 better (the cost estimations are being redone, the ones available now are one-yer old).

Management of such a system is an issue in need of serious attention (e.g. authomatic recover of AMS server failures, disk failures, processor failures). Four installation like this one can be fitted at CERN in the computer building. L.Barone asks if there is any estimation of the personnel needed.

Architectural Models -- I.Gaines

The graphs circulated on last Friday are presented. DPD stays for Derived Physics Data, indicating the smaller format on which analysis is performed, equivalent to today ntuples.

The figures on the 4th slide have to be discussed with the analysis WG also, like 5, 6, 8, 9; the analysis steps are supposed to happen in a rather organized way, except the DPD creation one, which is supposed to be rather caotic; the analysis at DPD level is not accounted for, as most of it is supposed to be done on the desk-top.

P.Capiluppi raises the issues of which reconstruction is done at RC's: only MC, also second pass on data or first recontruction too (I.Gaines: first reconstruction done at CERN only). Another issue is if the selection of initial sample is done at CERN or at the RC.

S.O'Neale : why not also MC reconstruction on the same desktop used for generation? What is intended for selection? creating a tag or kind of clever copying of events?

H.Newman on many of this points better to continue with an e-mail discussion and then converge faster in a model with variation.

E. Arderiu asks for clarifying which processes in the graphs are read-only.

H.Newman: we need to evaluate the network traffic generated by a "caotic" tranfer of DPD to the desktops.

L.Perini: the capability of performing selection of the initial analysis sample out of the bulk of the data is an important function of a RC, relevant to the justification of the huge storage requirement of the RC; it is however interesting to simulate a model in which such function is performed also at CERN, for comparation purposes too.

Y.Morita raises the issue of ESD and AOD versioning; I. Gaines advocates keeping on-line multiple versions of the AOD and not of the ESD.

S.O'Neale asks if the DPD are necessarily part of the DB, I.Gaines expects they to be also private with a light weigth persistency mechanism.

-Marseille meeting -- L.Perini

The Preliminary Agenda of the Marseille LCB Workshop is shown: MONARC is on September 29th; the Guidelines for the Distributed Computing and Regional Centres session devoted to MONARC are also shown and commented. The rapporteur for the Session is H. Newman, and L.Perini is the session responsible. It is responsibility of the rapporteur to collect and report the views of the different LHC experiments on the issues of interest for his session and to decide who else is eventually going to speak for illustrating them.

It is recalled that the LCB set some relevant Milestones for MONARC, esplicitely connected with the Marseille Workshop:

Show first simulation results on a preliminary LHC computing model "Baseline Model", stamped by the involved LHC experiments.
Show first positive results on the validation of the Simulation using real data from the MONARC testbeds.

Of course the stamping by the experiment is not yet meant as endorsing of the Baseline Model by the full communities, but simply as a green light for continuing to work starting from such a model which is recognized as meaningful and useful by the people representative of the computing in the experiments.

The importance of the evaluation of the Simulation results is stressed; some work on evaluation criteria and how implement them was started by I. Gaines and he is encouraged in continuing the work, asking for any input he thinks useful from the Analysis and Simulation WG's.