ATLAS Software Workshop Minutes

CERN, May 17 - 21, 1999

Mon 11:00 Topic A. Khodabandeh: Introduction to Configuration Management
References Slides: HTML, PS, PDF
Summary
  • Role-playing game: lamp owner, fan owner, power-supply assembler, battery maker, to illustrate some of the basic problems
  • Key questions: What are the problems? What are their causes? What can be done to avoid them?
  • What is needed? A whole range of possible keywords
  • Why configuration management? To know what we have to produce, where it is and in which state; to ensure that only the right people can use or change it; to understand the impact of changes; to make sure that needed information is available and that agreed procedures are followed
  • Functions: Configuration identification, configuration control, status accounting, configuration auditing
  • Configuration items: anything that needs to be controlled; items need to be identified first. For software: source code, test data and test code, design diagrams, documentation, project plan, compilers, libraries... in general, everything whose loss would seriously affect the project. Items need to be put into a hierarchy, and the class vs instance question needs to be addressed
  • Configuration identification: deals with types of configuration items, organises the structure, naming conventions, version numbering scheme, baseline planning
  • Configuration control: set up a library (eg. a software repository) with controlled check-in/check-out, administering all versions of all configuration items with guaranteed integrity. Reactions to proposed baselines: bug reports, change requests; these require evaluation of importance, relations, and impact. Rejected requests need to be archived
  • What about quick fixes? Must be possible in cases of urgency (eg. DAQ system broken), but proper follow-up (only occurrence? best fix? side effects? known problem? ...) is required asap afterwards
  • Configuration auditing: functional audit, physical audit, generate problem report
  • Status accounting: collect data on status of items, provide visibility of these data (automatic reports, answering of queries), notification, right data and right format
  • Where to start? First, does one really want to do configuration management? If so, who will be in charge (software librarian, configuration manager, ...)? What is really needed (size and importance of the project, distributed development, ...)? When is it needed? (Elements of CM can be put in place step by step.) How to implement CM?
  • CM implementation: circular process involving planning, defining procedures, dealing with people, making decisions, automating support, migrating. Tools can help, but they are far from the only aspect. A tool should be chosen only once it is really clear what one wants to achieve
  • What is being done in HEP: CVS, various SRT flavours, CMT, SCRAM (CMS), SCaM (CERN accelerator sector)
  • To go further: check http://spider.cern.ch/Processes/ConfigurationManagement, with commented pointers to other Web pages, book lists etc. Suggestions for updates to spider@cern.ch
Mon 14:00 Topic D. Duellmann: Object data bases as data stores for HEP, part II
References Slides: HTML, PS, PDF, PPT
Summary
  • Physical and logical model: federation, databases, containers vs a logical view made of objects and associations; the separation allows for optimisations transparent to the user
  • Reminder of present limits in terms of federation sizes etc.
  • Basic architecture: lightweight "servers", hence less scalability problems
  • Examples in /afs/cern.ch/sw/lhcxx/share/HepODBMS/pro/examples
  • First example: populate a database with persistent events (definition of classes, create federation with data bases and containers, create event objects)
  • Class definition: in .ddl files similar to standard C++ header files; classes need to inherit from a persistent base class. Persistent classes cannot contain other persistent classes as data members (references are possible, however), nor C++ pointers or references (C++ pointers are to be replaced by data base smart pointers). Additional features of DDLs: variable-length arrays as data members, bi-directional associations, 1-to-N or M-to-N associations. A preprocessor (ooddlx) produces schema source code and header files, and puts the schema information into the federation (see the sketch after this list)
  • Object browser exists to interactively look at database contents
  • HepODBMS (RD45 development): shielding layer for independence of vendor or release changes, HEP specific high-level classes
  • Example code: the 'new' operator allows a clustering hint to be specified; centralised treatment of clustering hints in HepODBMS
  • Container limitations: check the container size when a new object is created, manage a persistent list of containers
  • Persistent analysis objects: LHC++ uses Objy for histograms, tags and event data. OIDs used to directly access objects. Earlier idea: one federation per user, migrating to one federation per experiment
  • Ntuple vs TagDB approach: tags more flexible
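A minimal sketch of what such a class definition and a clustered creation might look like, written from memory of the Objectivity/DB C++ binding; all class and member names are invented, and the exact DDL association syntax may differ between releases:

      // Hypothetical Event.ddl - run through ooddlx, which generates the
      // C++ header plus schema code and enters the schema into the federation
      class Track;                             // another persistent class

      class Event : public ooObj {             // inherits from persistent base class
      public:
          long            run;
          long            event;
          ooVArray(float) rawData;             // variable-length array data member
          ooRef(Track)    tracks[] <-> event;  // bi-directional 1-to-N association;
      };                                       // a plain Track* would be rejected

      // Application code: the overloaded 'new' accepts a clustering hint,
      // e.g. a container handle, so related objects end up close together
      ooHandle(Event) evt = new(eventContainer) Event;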
Tue 09:00 Topic J. Knobloch: Introduction, workshop agenda
Summary
  • Overview of agenda: 2 tutorials, Changes in Atlas Computing, LCB workshop in Marseille, simulation, training, tools
  • Analysis tools workshop
  • Possibly a presentation of the candidate computing coordinator
Tue 09:10 Topic T. Åkesson: Changes in Atlas Computing
References Slides: HTML
Summary
  • Background: Atlas Computing Review, number of findings leading to many recommendations, requesting immediate actions from management. Report presented to EB which took note, management to put forward action plan to the EB
  • View of Atlas management: no technical decisions by management, putting together groups of people spanning different views in Atlas, taking into account work done so far in Atlas and in other experiments. Aiming for a system of institutional commitments
  • Action plan presented in last software workshop, put forward to April EB, approved after some changes, collaboration informed
  • Components of action plan: architecture task force, quality control group, national board, systems responsibles (SW, reconstruction, simulation, data base task leaders), training
  • Some uncertainties during the transition period are unavoidable
  • Mandates and compositions of the architecture task force and the quality control group
  • National board: concerned with networks, platforms, regional centres, collaborative tools, ...; composition: divided according to funding agencies, including a small regional centre working group with Monarc participation
  • Training: a global issue
  • Coordinators: computing: responsible for production of software as project leader for core software and as coordinator for detector specific software, Norman McCubbin proposed; physics: setting requirements and verifying performance, Fabiola Gianotti proposed
  • More precise discussion of mandates
Discussion
  • Q: What is the relationship between the systems task leaders and the architecture task force? A: There is no formal one
  • Q: Is it intended to have Atlas-wide software, reconstruction, simulation, and data base coordinators? A: Presumably yes, the system task leaders will form working groups which will elect chair persons
  • Reconstruction and simulation integration require full-time efforts each
  • Q: What is the role of the regional centres? Are they supposed to contribute to the core software effort? A: This depends on the case, but for some centres, this seems conceivable
  • Q: The discussion seems to assume that the CERN contribution to LHC computing is known, and that basic decisions about regional centres have been taken, although this is not the case. A: Yes; clarifying this is clearly a case for the national board
  • Q: It is necessary now to clarify the role and the status of the overall coordination roles (simulation, reconstruction, data base). A: This is acknowledged, but not too many firm decisions should be made before the computing coordinator is elected. Also, some flexibility in the setup will be required
  • Q: It would have been very advisable to define that existing mandates would extend until further notice. A: The transition period really lasts not until the new computing coordinator is fully functional, but only until the architecture task force has started
  • Q: Some existing groups (graphics, control, analysis tools ...) are not represented in the task forces. Does that mean they are cancelled? A: No, this issue will be dealt with in due time
  • Community wants clear signal that work should continue until indicated otherwise
  • Q: Is there a time scale for putting the new computing organisation in place? What is the procedure for nominations and elections? A: By beginning of September, there should be a clear picture including all essential nominations. System task leaders to be nominated by systems, overall coordinators will be proposed by computing oversight board
  • Q: Is there anybody foreseen with an architectural role? A: This is up to the computing coordinator, but it is conceivable that such a person will be required
  • Q: Architecture is of utmost importance now. What is the mechanism of communication between the architecture task force and the systems? A: That's why the architecture task force should start as soon as possible. Proper communication with the community is vital for the success
  • Q: When is the architecture task force going to get started? A: There are still points of discussion with the CERN group
  • There must be strong links between the system task leaders and the architecture task force
  • Q: If the computing coordinator is at the same time project leader for core software, why hasn't the computing community had a chance to form their opinion and propose somebody? A: The computing coordinator needs to be treated as a coordinator, with one proposal put forward by the management to the CB
  • Q: Is there any contingency plan for the software?
  • Q: What about the end of 99 status report? A: This can only be seen with time, once the structures are in place. We should not be looking into the past too much
  • Q: For the nominations of the coordinators, have the communities in other cases not been consulted as well? A: The collaboration has been asked to give their input, which has been very carefully considered by the management. The central task is to bring the communities working on the new software, and on the physics TDR software, together
  • It would have been wiser to seek the support of the computing community for the nomination
  • For the LHCC status report, advice by the referees should be sought as to what the report should address
  • The idea of defining work packages is considered very useful
  • There is some hope that people will look at the changes in a positive spirit
Tue 10:50 Topic M. Stavrianakou: Repository and releases
References Slides: PS, PDF, SDD
Summary
  • Production software mostly stable except for reconstruction, Atlfast and applications (being ported to more platforms)
  • C++ software steadily evolving
  • Suggestion to have an overview over existing C++ packages given in one of the next workshops
  • Supported platforms: HP, DEC, IBM, Linux, Solaris; not supported: SGI (some work done in Boston), WNT
  • Releases: roughly fortnightly (26 so far in one year), nightly builds (not really used yet by developers); Fortran software usually builds, problems with new software. Production release planned for June
  • Outstanding issues: generator packages out of date (next physics coordinator to take up), package author list to be updated, move to new CLHEP and CERNLIB to be scheduled, SRT compilation log analyser, releases both in debug and optimised mode, librarian support (deputy)
  • SRT: some improvements required (documentation, functionality - cope with increasing release size, concurrent debug/optimised releases, non-global releases and sharing of binaries). Spider/SRT project stopped by IT division. Maintenance problem - action is needed now
Discussion
  • Atlas10 (new HPUX 10.20) has problems linking
  • What about IBM? Not clear whether they will move to the C++ standard
  • After the sudden death of the Spider project, we are to review the situation and evaluate the existing solutions; a task force is needed
  • Could try to confront Spider work model with Atlas requirements
Tue 11:10 Topic D. Rousseau: TDR software and productions
References Slides: PS, PDF
Summary
  • Simulation: Dice mostly frozen since February 1998, with the exception of the muons. A number of productions were done with an obsolete geometry. Dice is in CVS, but the CVS version has not been used for production yet. New pile-up method used in calorimeters
  • Reconstruction: common clustering, IPATREC and XKALMAN widely used, PIXLREC less heavily used. XKALMAN++ tested, will be moved to repository. Calorimetry: JetFinder libraries, more detailed output. Muon system: better pattern recognition and fitting for low pt, correct covariance matrix, timing improved
  • Timings: no optimisation for CPU time done yet
  • Combined reconstructions: complex matrix of combinations. Not all algorithms properly integrated in Atrecon because of timing, some just designed to run on the combined N-tuples
  • Vertexing: conversion finding, K0_s and secondary vertex, primary vertex
  • e/gamma identification
  • Combined muon measurement: two approaches: combining muonbox tracks with xkalman tracks; refit of ID hits with muonbox tracks
  • Lessons to learn from CBNT: different usage in groups, both for optimisation of combined performance tools and for physics studies. Hitting the annoying limit of 50'000 variables, hence careful tuning of contents is required. Small size appreciated for export. Interest clearly demonstrated, but the solution should be better than Hbook + Paw
  • Productions: simulation: lots of different channels, bottleneck was person power for supervision. Reconstruction: mostly done on private basis, not much centrally organised
  • Conclusions: TDR software ready in time, all holes successfully filled. Next steps (before transition to OO): collect and archive information about productions, collect code so far kept private, get comments from people using the software about what they like and dislike, check the DICE CVS version. What about coping with updated geometry: implement that in Dice, or move to Geant4 first?
Discussion
  • A decision needs to be made about the potential DICE geometry update
  • Tendency is to go for a freeze of the Fortran software, unless the new coordinators decide otherwise
Tue 11:50 Topic H. Meinhard: Platforms, other Focus issues
References Slides: HTML, PS, PDF, PPT
Summary
  • Focus: What it is, its mandate, new chairman (Paul Jeffreys) and new secretary (Marco Cattaneo)
  • Widely accepted trend to go for PCs for physics data processing, with Linux and NT (the latter mainly for commodity and productivity software)
  • Policy proposal, generally accepted: Improve support for Linux and NT, discourage investments into Risc hardware, commercial Unix O/S to be frozen, end date for support to be determined
  • Other items: Storage management, HSM (continue HPSS, but no firm decision until end 2000), shift software (major revision ongoing); Y2K; new printing service; changes in IT structure
  • Distribution of LHC++ and G4: one compiler and OS per platform for LHC++, taking source code changes for unsupported combinations back into repository, and accepting contributed binaries into standard places. CLHEP released more frequently than the rest of LHC++. Licensing and access restrictions under discussion, will be open (GPL). Geant4 so far only distributed as source tar files, will distribute optimised libraries from next release on
Discussion
  • Printers: what is the difference with xprint, and does xprint still exist?
  • Asis: will it still be around in three years?
  • LHC++ distribution via asis? No progress yet
Tue 12:10 Topic J. Knobloch: LCB workshop
References Slides: HTML, PS, PDF, PPT
Summary
  • Date: September 28 till October 1st, place: Marseille (France), Web site: http://marcpl2.in2p3.fr/LCB/. Participants: mainly people from LHC experiments, some from IT, BaBar, Fermilab. No quota
  • Subjects: Architecture (components, design issues), technology tracking (networking, processors, memory, storage), world-wide computing (Monarc, collaborative tools, data management), simulation, analysis tools (available tools, practical experience)
  • Format: Each session introduced by rapporteur summarising the thinking of the experiments, much time devoted to discussions, contributions from community by early July (abstracts and Web links to get the rapporteurs interested)
  • Next: propose rapporteurs and conveners, establish guidelines concerning issues to be treated, propose and prepare Atlas contributions
Wed 09:05 Topic J. Apostolakis: Geant 4 status and experience
References Slides: HTML, PS, PDF, PPT
Summary
  • Version 4.0.0 in December 1998, marking end of R&D; collaboration formed since then
  • Very powerful G4 kernel (tracking, stacks, geometry, hits), physics models, additional capabilities (persistency, visualisation), greatly surpasses Geant 3
  • Tracking: general and flexible; event: powerful stacking at no extra cost; geometry: hierarchical or flat, voxels for speed; hits: user defined (see the sketch after this list)
  • All processes at least at G3 level; hadronic processes: distinguish process and model, models data driven or parametrisation driven
  • Geant 4 experiences: Atlas, CMS (from AIHENP99), Borexino, BaBar (fast simulation)
  • Comparison by CMS of Geant3, Geant4, and test beam data
  • Borexino application of Geant 4: very realistic simulation of the geometry and the processes
  • BaBar: using Geant4 for their fast simulation (simplified geometry), using G4 facility of parametrised processes. Full simulation with G4 under development
  • Benchmarks: focus on EM physics performance, compares speed at constant physics, physics at constant speed, using two configurations (thin silicon, sampling calorimeter). Performance lead of Geant4 over Geant3
  • Since January 99, urgent patches with fixes released, consolidation release due end May 99 (fixes, minor improvements, few more models, ability to use STL rather than RogueWave); 4.1.0 scheduled for end July 99 (additional physics models, more functional improvements)
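As an illustration of the user-defined hits mentioned above, here is a minimal sketch of a hit and a sensitive detector in the Geant4 style; the class and member names (MyHit, MySD, fCollection) are invented, and the per-event creation of the collection (normally done in Initialize()) is omitted for brevity:

      #include "G4VHit.hh"
      #include "G4VSensitiveDetector.hh"
      #include "G4THitsCollection.hh"
      #include "G4ThreeVector.hh"
      #include "G4Step.hh"

      class MyHit : public G4VHit {            // the user decides what a hit contains
      public:
          G4double      edep;
          G4ThreeVector pos;
      };

      typedef G4THitsCollection<MyHit> MyHitsCollection;

      class MySD : public G4VSensitiveDetector {
      public:
          MySD(const G4String& name) : G4VSensitiveDetector(name), fCollection(0) {}
          // called by the tracking for every step inside the sensitive volume
          G4bool ProcessHits(G4Step* step, G4TouchableHistory*) {
              MyHit* hit = new MyHit;
              hit->edep = step->GetTotalEnergyDeposit();
              hit->pos  = step->GetPreStepPoint()->GetPosition();
              fCollection->insert(hit);
              return true;
          }
      private:
          MyHitsCollection* fCollection;       // created per event in Initialize()
      };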
Discussion
  • Q: Are there tools to do pile-up studies for hits and/or digis? A: Yes, something was implemented; Makoto Asai is the expert
  • Q: Were there any geometries created from CAD systems? A: Not known
Wed 09:30 Topic A. dell'Acqua: Detector simulation activities in Atlas
References Slides: HTML, PS, PDF, PPT
Summary
  • CHAOS project: aims to exploit Geant4, to implement the future Atlas simulation program through OO analysis and design, and to bring new people on board (on Geant4, on C++, on OO)
  • Start with a core group (former G4 members), follow a formal process, iterating through categories, OO design, implementation
  • Prototype work started to get people interested, lots of buglets in Geant4 found
  • Training program set up, simulation group being built. Aim: simulation program which surpasses DICE in functionality, and which is maintainable
  • Categories: concentrating on ChaosSimulationControl, detectorConstruction, detectorDescription
  • Design category by category (example: run and event category)
  • Prototyping going on practically on all subdetector systems (with the exception of EM calo). Muon system almost completely simulated, takes detector description from AMDB, integration of B field classes, tracking in magnetic field. Problems due to a bug in Geant4 (chambers were not transparent...). Hits being implemented now, will use detector description scheme (by RD and Christian), direct reading from AMDB meanwhile. Acceptance studies; medium term: interface to reconstruction
  • Another prototype: silicon tracker (Makoto Asai), rather formal approach involving careful OO design. Parametrisation pushed to its limits
  • Tile cal testbeam: Code written for G4 course, not very well structured, being redesigned, many problems in tracking in Geant 4 found (and meanwhile solved). Adding hits, digits, N-tuple facilities. Putting everything into G4 framework. To be confronted with Geant 3 data
  • Work going on with TRT simulation (Maya), hadron endcap and forward calorimeter (Rachid), TGCs (H. Kurasige)
  • Missing: EM calo; its geometry cannot be built the same way as with G3, performance implications are to be understood, and new functionality for the G4 geometry may need to be requested (see the construction sketch after this list)
  • Training: first course given on 16 - 19 February for Tile cal (20 people). Positive feedback, but the course was too short. Another course arranged for the muon group end June, with some participants from ID, aiming at a 5-day course
  • Bits and pieces are falling into place, can start building mock-up geometry, need to push for more test beam simulations, give component model a try, improve communication (Web page...), interface with other domains of Atlas OO software (detector description is of utmost urgency), work on generators (Fortran wrapping), simulation jamboree (delayed from May to summer, with new coordinators)
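To make the difference with G3 geometry construction concrete, below is a minimal sketch of hierarchical volume construction in the Geant4 style (function, volume names and dimensions are invented; the header providing the CLHEP units has varied between releases):

      #include "G4Box.hh"
      #include "G4LogicalVolume.hh"
      #include "G4PVPlacement.hh"
      #include "G4Material.hh"
      #include "G4ThreeVector.hh"
      #include "CLHEP/Units/SystemOfUnits.h"   // for the unit 'm'

      // A solid defines the shape, the logical volume adds the material,
      // and the placement positions it inside its mother volume
      G4VPhysicalVolume* buildChamber(G4LogicalVolume* mother, G4Material* gas)
      {
          G4Box* box = new G4Box("chamberBox", 1.0*m, 0.5*m, 0.1*m);
          G4LogicalVolume* log = new G4LogicalVolume(box, gas, "chamberLog");
          return new G4PVPlacement(0,                      // no rotation
                                   G4ThreeVector(0, 0, 0), // position in mother
                                   log, "chamberPhys", mother,
                                   false, 0);              // copy number 0
      }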
Discussion
  • Q: What size is the core group? A: About 10 people
  • Q: Is the silicon prototype for the barrel only, or does it include endcaps? A: For the time being, it's barrel only, but with hits and digits
  • Q: What is Momo? A: It's the graphical user interface used in Geant4
  • Q: Is the testing procedure in Geant 4 adequate, given that such an incredible bug in the tracking could slip through? A: No, it is being improved
  • Q: Do we have a mechanism to decide which of the patches to install? A: For the time being, we install all patches
  • We should absolutely avoid creating another private version...
  • Q: What fraction of the Geant4 code are we testing with our examples? A: We aim at testing most of the physics, and of the geometry. Importing from CAD systems is not yet tested by us. Most other parts have not been tested a lot
  • For the link between detector description and simulation, working meetings need to be organised
  • Q: For how long do we need to maintain Dice? Which version of the geometry should be ported to Geant4? A: The latest Dice geometry should be ported, and Dice development should be stopped, but the changes in ID geometry need to be discussed
  • Q: What is the status of the changes of the pixels? A: There is code for Dice in CMZ, but it has not much been tested
  • Proposed decision: suspend implementation of the changes in Dice unless strong evidence is put forward that it is necessary
Wed 10:25 Topic M. Stavrianakou: TRT test beam sector prototype simulation with Geant4
References Slides: PS, PDF, SDD
Summary
  • Setup: 5 sectors of 16 planes each; each of these consists of 1 radiator plane and 16 planes of straws
  • G4 particle gun, all standard G4 physics processes included, problems being investigated. Different TR models to be used once available
  • Hits implemented as simple objects, digits implemented rudimentarily
  • Detector geometry debugged (using DAWN and DAVID), tested with tracking (10 k pions), material "measured" by shooting Geantinos (8...10% X0)
  • Results (energy deposit, number of hits) for incident pions look qualitatively correct, more quantitative checking needed by experts. For electrons, there is an unphysical peak in the energy deposit which is being looked at
  • Next steps: Wait for electron bug to be fixed, study electrons with and without TR, improve digitisation, more realistic geometry, investigate fast simulation options, first prototype of persistency, transient and persistent histograms, more complete test beam setup
Discussion
  • Q: Is the energy deposited in the straws equal to the energy difference between incoming and outgoing particles? A: Not yet checked
Wed 11:00 Topic C. Onions: Training
References Slides: HTML, PS, PDF, PPT
Summary
  • About half of the countries have nominated training contact people already. The US nomination has been withdrawn
  • Course for systems coordinators and task leaders: Course on June 14 - 18 at CERN on Hands-on Analysis, Design and Programming with C++, being filled slowly, but can stand more applications
  • Consultancy: specific names not yet identified, suggestion is to use the mailing list atlas-sw-developers@atlas-lb.cern.ch
  • Other activities: Consultant discussed training needs, awaiting report; discussion on a C++ tutorial series by IT division
  • UCO presence in building 40, including the book shop; update the list of recommended books, CDs, videos; complete the training contact list; fill the course
  • Next goals: Identify good physics examples; organise design and code walkthroughs; prepare guidelines for de-centralised training
  • Other courses: Geant4 by Andrea dell'Acqua
  • Training Web page: http://atlasinfo.cern.ch/GROUPS/SOFTWARE/OO/training
Discussion
  • Marvellous job on C++ and OO, but other areas underrepresented (cvs, SRT, ...)
  • Q: Is there a registration fee for the courses at CERN? A: Yes; the courses are provided by Educational Services, who charge 200 CHF / day
  • Q: What is the status of Java? A: Not being pushed for at the moment, no need to make special training efforts. Also, there is the series of IT tutorials on Java
Wed 11:25 Topic S. Fisher: Case tools for Atlas
References Slides: HTML, PS, PDF, PPT
Summary
  • Full story: http://atlasinfo.cern.ch/Atlas/GROUPS/SOFTWARE/OO/tools/case/
  • Questionnaire prepared for Atlas, few replies received. Invited other experiments to reply. In total, 19 (15 Atlas) replied
  • People want both a simple and a big tool
  • Should not restrict ourselves to one tool
  • Customisation is important
  • Operating systems requested: NT, Linux, Solaris
  • Little difference in what people use Rose and StP for, but greater satisfaction with Rose
  • Best points about Rose and StP: much in common
  • Worst points: Rose: crashes often on Unix, very high memory and CPU consumption; StP: speed of execution, speed of startup, eccentric user interface
  • Rose: only supports the commonalities between OMT and Booch, not really full UML - activity diagrams missing, code repository missing
  • StP: ...
  • Evaluation of tools: Looked at 12 tools which support UML on Unix and NT, with C++ support. Details in the Web page
  • To be considered seriously: StP UML 7.1: major improvements, but NT only for the time being, Unix version in September/October. Now a real Windows program, uses Sybase or Access. Startup: ~ 3 seconds, but a high price to pay in terms of performance for a central repository at CERN. Interface greatly improved
  • Very interesting: Together; round-trip is automatic for both Java and C++, model and code are always in step, and it actually works well. Written in Java. Performance fine on a 450 MHz PII, easy to use, one can define one's own way of mapping UML objects to code. Small text files (no data base), facilitating group work. Code held in memory - need to restrict to packages of reasonable size. Scripting with Java or Python. Whiteboard edition is free. Documentation generation excellent. No support for name spaces and nested classes. Summary: superb, reasonably priced tool
  • Argo: free, open source, no C++ support as yet, supports XMI, lots of good ideas in interface, rather buggy, suffers from the slowness of Swing. Could perhaps take off
  • Recommendation: For heavy tool, stay with StP for the moment. As light tools, both Together and Argo are very interesting
Wed 12:10 Topic S. Fisher: Status of reviews
References Slides: HTML, PS, PDF, PPT
Summary
  • Dig decided that ongoing reviews should be completed
  • SRT documentation and design: Waiting for new deliverable
  • Muon code, graphics code both waiting for one reviewer
  • "DG" issues (issues to be brought up later): important requirement and manpower issues. Some came up in SRT review: srt configure rather than autoconf, manual to give advice on sensible use of CVS, missing chapters. Proposal to ask QC group to take these points on
Discussion
  • Q: What about work finished now? A: No formal decision has been taken; we should encourage people to carry on reviewing
Thu 09:10 Topic M. Stavrianakou: Introduction
References Slides: PS, PDF, SDD
Summary (see slides)
Thu 09:20 Topic S. Fisher: Analysis tools requirements
References Slides: HTML, PS, PDF, PPT
R. Somigliana, K. Sliwa: Data analysis software tools requirements (draft)
Summary (see slides)
Discussion
  • We need to become clear about what we mean by Analysis
  • One should not try to assign responsibilities too early, it is more important to have a complete view of the requirements
  • Why have we replaced the requirements collected in the graphics domain by something actually worse? We ought to collect more high-level requirements
  • Need to distinguish very clearly between user and system requirements (the latter can be considered methods)
  • Should try and have a common comprehensive set of requirements which can be projected according to the viewpoint
  • Important that requirements are as complete as possible, and ranked according to priority. Every effort must be made to capture the requirements of the end user, should not worry about formalisms of the requirements document too much
  • Q: How do we proceed? A: Input hoped for during this meeting, at the end we could establish a small working group to carry on
  • It would be useful to make sure brainstorming results are not lost
  • The requirements capture phase should not take too long (much less than a year), input solicited from all physics groups and systems
  • Requirements should be testable
Thu 10:00 Topic L. Tuura: Architectural issues
References N/A
Summary
  • Architecture is a moving target, all software is going to be re-written a couple of times. Should not become religious about these issues
  • Important to focus on the target, not on the methods
  • Need to deliver now, even if not perfect
  • Need to collect experience, requirements may be modified with experience with the software
  • Strategic choices to be made: the whole chain from event filter to user analysis should be consistent, packages should be movable from one application domain to another; should be as independent of any specific tool as possible; ease of use or experimentation puts requirements on performance; C++, object orientation, component architecture
Discussion
  • The new architecture task force is the forum to discuss these strategic choices, they must be supported by the whole collaboration
  • Q: Re-writing parts of the software is fine in the component model as far as the components themselves are concerned, but what if we want to change the glue between the components? A: This glue is a very thin layer anyway
  • Can the Event Filter afford objects or components, given their potential overhead? Of course, this is an implementation issue to be watched
  • Q: Shouldn't we aim for large components in order to reduce the fraction of glue in the system? A: Abstractions should go bottom-up
  • Even Level2 algorithms should run in the same framework
Thu 10:30 Topic C. Tull: StAF Architecture
References Slides: HTML, PS, PDF, PPT
Summary
  • StAF: Standard analysis framework
  • Challenges: data volume, processing power, geographical spread of the community, time frame of the projects
  • Framework: reuse of code that calls application code
  • StAF: horizontal and vertical modularity, explicit graceful retirement, adoption or imitation of industry standards, scripting access to code API, relies on code and doc generation tools, dynamic linking and unlinking, multi-language support
  • Horizontal modularity refers to ASP domains, vertical modularity reflects protocol and interface standards; Corba actually used
  • Objects and packages: ASP (analysis services packages), PAM (physics analysis modules), data objects, error stack, result stack
  • ASP: 1 object factory class, 0 or more worker object classes. Communication via software bus (see the sketch after this list)
  • User code: C, C++, Fortran possible
  • Data set access: self-describing data based on XDR and Corba IDL, running on large variety of systems
  • Data navigation and manipulation a la Unix
  • User interface: scripting language preferred over GUI, interface reflects underlying classes
  • Used in Star, Phenix, AGS, Grand Challenge, Clipper
  • Framework approach largely exploited in existing analysis programs, interesting tools and technologies available. Framework and infrastructure must be available early, and long
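A schematic rendering of the ASP structure described above, with all names invented: each package exposes exactly one factory, workers do the actual work, and a software bus lets any module locate any other one with a compatible interface.

      #include <map>
      #include <string>

      class Worker {                           // 0 or more worker classes per ASP
      public:
          virtual ~Worker() {}
          virtual void execute() = 0;
      };

      class Factory {                          // exactly 1 object factory per ASP
      public:
          virtual ~Factory() {}
          virtual Worker* create(const std::string& kind) = 0;
      };

      class SoftwareBus {                      // registry connecting the packages
          std::map<std::string, Factory*> factories_;
      public:
          void     publish(const std::string& asp, Factory* f) { factories_[asp] = f; }
          Factory* lookup (const std::string& asp)             { return factories_[asp]; }
      };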
Discussion
  • Q: How does StAF relate to the component model? A: The latter is at a lower level
  • Q: Can all modules or components talk to all other? A: Yes, if they have compatible interfaces
Thu 11:25 Topic RD Schaffer: Data mining for analysis
References Slides: HTML, PS, PDF, PPT
Summary
  • Fairly traditional event structure: raw data, event summary data, analysis object data, event tag. Constraints: raw data will mostly remain at CERN, ESD hence needs to contain enough detail to re-create the AOD
  • Logical view: from the event header, navigation is possible to all parts of the event (see the sketch after this list)
  • Event collections: collection holds a set of events within a particular context (selection of events, selection of parts of events)
  • Key point: optimise access and turnaround for typical analyses; handles: layered access (tags, AOD, ESD), organisation (column-wise storage, indexing schemes, re-clustering of sparse events). Need to understand analysis scenarios - which event selections, and which parts of the events, will be accessed most?
  • Current status: fairly early. Event model: access to raw data. Work started on general tools for ESD and AOD. Available from RD45: Event collections, hierarchical naming facilities, lightweight persistency prototype
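A bare-bones illustration of this logical view, with invented types; in the real store the pointers would be database references (OIDs) rather than C++ pointers:

      struct RawData { /* full detail; mostly stays at CERN       */ };
      struct ESD     { /* enough detail to re-create the AOD      */ };
      struct AOD     { /* compact objects for day-to-day analysis */ };

      // From the event header one can navigate to every part of the event;
      // the selection attributes are duplicated in the event tag
      struct EventHeader {
          const RawData* raw;
          const ESD*     esd;
          const AOD*     aod;
          unsigned long  tag;
      };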
Discussion
  • Granularity of data access still needs to be defined, clearly depends on physical parameters of Objectivity, but realistic scenarios for analysis are required
  • For users, not only copying a subset of the event information is required, but the user must also be able to add his own information
  • During analysis, the data base should be updated only in a very well controlled way
  • How is the question of storage vs just-in-time processing addressed?
Thu 11:55 Topic M. Stavrianakou: Evaluations of tools
References Slides on existing tools: PS, PDF, SDD
Slides on evaluation: PS, PDF, SDD
Summary
  • Kinds of analyses to be considered: real physics and test beam analysis, but also online monitoring, calibrations, ... Specific requirements? Scenarios / use cases? Can we extrapolate or interpolate?
  • Physics and detector communities to provide input, via questionnaire
  • Could use questionnaire from D0/CDF on run II data management needs, would have to be enlarged
  • Available tools: CERNLIB components, LHC++ components, ROOT, JAS, OpenScientist, others (HippoDraw, Grace, various commercial tools)
  • How to choose? evaluation according to requirements and use cases, compliance with the architecture, choice of standards, programming languages and technologies, resources, timescale
  • Functionality to be considered: I/O, histogramming, fitting, plotting, ...
  • I/O: Objy/DB, light-weight persistency, ROOT, Root I/O, ... Choice depending on data types, volumes, access and selection patterns. Should be as decoupled from the rest as possible; should be possible to use the appropriate tool for each application, not necessarily always the same one
  • Histogramming: Root, HTL, JAS, OpenScientist, ... Some soon to be interchangeable, some offering new functionality. How important is the association to the raw data?
  • "Toupling": HEPODBMS event coolections and tags, Root trees and Ntuples, HepTuple
  • Fitting and minimisation: Minuit, Gemini, NAG, ... Must be interchangeable
  • Plotting: Paw, Root, HepInventor/HepExplorer, JAS, OpenScientist. 2D poorly addressed by commercial tools, PAW/ROOT paradigm still very appealing. Interchangeability highly desirable
  • Interactivity: Paw, Root, HepExplorer, Jas, OpenScientist. The Paw/Root paradigm is very appealing, HepExplorer is not convincing the end user, JAS is being considered as an alternative, OpenScientist is very interesting. Scripting functionality is essential, possibly with a choice of more than one language. Could use SWIG to interface with Perl, Python, Tcl/Tk, ...; CINT debated controversially
  • Usability and performance, modularity and flexibility, maintainability and extensibility, replaceability, restrictions imposed by languages and standards, and by legacy software, resources and timescales
  • Atlas should decide which tools to evaluate, LHC expts or CERN or HEP should make an effort to coordinate analysis tools development work
  • Plans: start review of requirements, start evaluation exercise
Discussion
  • The problem is that the replies to the questionnaires tend to be biased
  • This discussion is years too late
  • A Hep-wide evaluation and coordination may face serious problems
  • Interoperability is a very desirable aim, but we really need to understand what we are talking about. A Hep-wide standardisation effort would be invaluable
  • Information about physics channels, data volumes etc. still exist from the CTP times, should still be available
  • Q: What is the role of the CBNT? Will it be frozen with the Fortran software? A: It should serve as a starting point for a more conforming, more functional implementation
  • HEPCCC could also be considered as a body to discuss this question
  • LHCb said to have partly and temporarily accepted the Root I/O scheme
  • Some more perspective into the future, and forward looking judgement, is required
Thu 14:05 Topic J. Schwindling: MLPFIT: a tool to design and use multi-layer perceptrons
References Slides: HTML, PS, PDF, PPT
Summary
  • Artificial neural networks started in the 1940s and are now widely used in many areas. Hep started using them in the late 1980s for classification (particle id, event classification, search for Higgs), track reconstruction, triggering, and function approximation
  • Multi-layer perceptron: hidden layers between the input layer and the output layer apply a non-linear function to a linear combination of the layer above them. The output neuron is a linear combination (see the sketch after this list)
  • Theorems: this network can achieve function approximation, good discrimination between signal and noise
  • Learning phase: tuning of the parameters to minimise a chi-square-like function, requires the first derivative of the errors
  • Learning methods: stochastic minimisation (linear model with fixed or variable steps), or global minimisation. Stochastic minimisation sometimes works badly on deterministic problems. The hybrid method is by far the fastest
  • MLPfit designed to be used both for functional approximation and classification. Implementation: 3000 lines of C, precise (double precision), fast, inexpensive (dynamic allocation of memory), easy to use
  • Performance: at least competitive with well known and widely used packages (Jetnet, SNNS)
  • MLPfit reads an Ascii file with all parameters, and writes Ascii files containing its results, as well as Paw files. Uses dgels from Lapack to solve the linear least squares problem
  • MLPfit is callable, parameter passing via dedicated routines; Labview and Paw interfaces available, too
  • Plans: support the code, improve it, test it
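For illustration, a minimal forward pass of the perceptron described above (this is not MLPfit's actual API, which is C with dedicated parameter-passing routines): each hidden neuron applies a sigmoid to a linear combination of the previous layer, while the output neuron remains linear.

      #include <cmath>
      #include <cstddef>
      #include <vector>

      double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

      // One layer: out[i] = f( bias[i] + sum_j w[i][j] * in[j] )
      std::vector<double> layer(const std::vector< std::vector<double> >& w,
                                const std::vector<double>& bias,
                                const std::vector<double>& in,
                                bool hidden)               // hidden layers: non-linear
      {
          std::vector<double> out(w.size());
          for (std::size_t i = 0; i < w.size(); ++i) {
              double s = bias[i];
              for (std::size_t j = 0; j < in.size(); ++j) s += w[i][j] * in[j];
              out[i] = hidden ? sigmoid(s) : s;            // output neuron: linear
          }
          return out;
      }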
Discussion
  • Q: What about the argument against neural networks that you never find something unexpected, and that you never exactly know what you are doing? A: If it were deterministic, it would be linear; one would be missing the additional power of this non-linear approach
  • Q: What about error estimates for the result of a neural network? A: This is being studied by mathematicians, no satisfactory answer yet
Thu 14:50 Topic D. Malon: Data mining
References Slides: HTML, PS, PDF, PPT
Summary
  • What does data mining mean? Examples of standard application domains and questions. The essential characteristic is knowledge discovery, not information retrieval. Commonly the emphasis is on patterns and models. In HEP: some people mean random access to large amounts of data, others computationally intensive event-by-event analysis
  • Unlikely that COTS solutions can fulfil our requirements
  • Start from idea of accessing condensed data set on disk, navigating to more comprehensive information on tertiary storage transparently. However, arbitrary random access to tapes is to be discouraged
  • Tag data base: basis for selection
  • The issue has been addressed by the DOE Grand Challenge Project; one aim is to provide information about the number of tape mounts to be expected. Possible to coordinate the staging requests from within the running job with those from other requests. The idea is that the exact order of events in processing does not matter (foreach vs. standard loop approach; see the sketch after this list). Easy implementation of parallelism. Corba widely used (can be hidden)
  • Less ambitious models: data trains, carousels: looping continuously through all data, serving events as they are retrieved from the media
  • Preferred strategy is still to produce compressed disk-resident data sets
  • Rough sets being studied
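A sketch of the foreach idea, with all names invented: the user hands the framework an operation to apply to every selected event, and the framework is then free to deliver events in whatever order minimises tape mounts and staging, instead of following the fixed order of a conventional loop.

      #include <queue>

      struct Event { int id; };

      class EventServer {                      // fed by the staging coordinator
          std::queue<Event*> staged_;
      public:
          void   stage(Event* e) { staged_.push(e); }
          Event* nextAvailable() {
              if (staged_.empty()) return 0;
              Event* e = staged_.front(); staged_.pop(); return e;
          }
      };

      template <class Op>
      void forEachEvent(EventServer& server, Op op)
      {
          // no ordering guarantee: events are processed as they become available
          while (Event* ev = server.nextAvailable()) op(*ev);
      }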
Discussion
  • Q: Are there implementations of rough sets? A: Begun to build a prototype based on the Star tag database, but no results yet. One has to worry about biases when concentrating on events which are disk resident
  • Once the data base has been populated with N-tuples or Atlfast++ files, we could try out some of these concepts
Thu 15:25 Topic J. Hrivnac: Graphics for analysis
References Slides: HTML, PS, PDF, AG
Summary
  • Evolvability: need to ensure that system is future-proof
  • Mission statement of graphics domain
  • Design of graphics made with flexibility in mind - objects should not know that they can be visualised, design independent of any particular visualisation tool
  • High-level graphics design; data input from event domain (Zebra tapes, Objy database), or simulation (MC truth)
  • Use cases: object browser (tree a la MS Windows Explorer, drag-and-drop functionality); end programmer: automatic creation of visualisation base classes for any given (non visualisation-aware) classes; scene developer: writes two classes, Scene and Rep (see the sketch after this list)
  • Democracy of scenes - not all reps need to be implemented for all scenes
  • XML being used for exchange of Ascii information, found to be very useful. Future: further application, more direct mapping between Objy and XML files
  • Current status of graphics: definitions: real and graphical objects, operations on real and visualised objects, views, static objects, streaming objects, volatile objects. Requirements: general, existence, environment, functional requirements. Design done, underwent review. Implementation: ED, modeler, resolver functional, will be upgraded to improve design and functionality; object browser foundation and prototype exist; Tree builder temporarily implemented
  • Views: Event display like: AVRML (3D) fully implemented; Aravis (ramped-up Arve graphics) implemented, reps to be implemented; Atlantis (sophisticated ED) well implemented, problem with data access; Wired (Java based ED working over the Web) well implemented. Statistical/histogram views: HbookTuple implemented, quite obsolete; HistOOgrams simply implemented, quite obsolete; AHTL (HTL histograms) implemented, AOS (OpenScientist histograms) planned, AJas (JAS histograms) planned, Orca (simple integrated environment) basically implemented. Misc.: AsciiText fully implemented, XMLScene implemented, Command
  • Interfaces to data structures or classes need to be implemented, participation from systems requested
  • Documentation: Implementation guidelines, FAQ, design document, package documentation exist
  • Domain interface: Graphics is what you see on the screen; shows the results of the analysis
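A schematic version of the Scene/Rep split described above, with invented names: the real object carries no graphics knowledge, a Rep knows how to present one kind of object, and a Scene stands for one kind of view; following the democracy of scenes, not every Rep needs to exist for every Scene.

      #include <iostream>

      class Track {                            // real object: no graphics knowledge
      public:
          double pt() const { return 42.0; }
      };

      class Scene {                            // one concrete Scene per view type
      public:
          virtual ~Scene() {}
          virtual void draw(const char* primitive) = 0;
      };

      class AsciiTextScene : public Scene {
      public:
          void draw(const char* primitive) { std::cout << primitive << '\n'; }
      };

      class TrackRep {                         // visualisation wrapper for Track
          const Track& track_;
      public:
          explicit TrackRep(const Track& t) : track_(t) {}
          void appendTo(Scene& s) { s.draw("polyline representing the track"); }
      };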
Thu 16:00 Topic S. Resconi: Status of ATLFAST++ in LHC++
References Slides: HTML, PS, PDF, PPT
Summary
  • Based on concept of makers, many of which are implemented
  • Main steps done: eliminate dependencies on Root, check results, put data into Objectivity, visualisation with HepExplorer
  • Root dependency: containers replaced by STL and VArray; interface to physics processes rewritten
  • Comparison with Fortran version: checked on sample of 10'000 events for two popular channels
  • Data into Objectivity: following tag/event data model proposed by RD45, one single container, no associations
  • Read data from Objectivity: visualisation with HEPExplorer, perform analysis map. Alternatively, read data with simple C++ program, being used by Monarc - 100'000 events filled (1.7 GB events, 8.4 MB tags)
  • Further refinements needed, still some Root classes in, more physics channels to be checked; can be used to learn more about LHC++ components, and to evaluate analysis tools
Discussion
  • What is the evolution of Atlfast in general? What should be maintained in the long run? Should the fast simulation not be provided by Geant4? More discussion needed in the physics community
  • Repetition of Atlfast++ tutorial could be very useful
Thu 16:30 Topic M. Dönszelmann: Java agents
References Slides: HTML, PS, PDF, PPT
Summary
  • User requirements: Access to data (may need to move jobs rather than data); access to processing power (1300 machines as of today); analysis jobs need to be restartable/checkpointable; ease of use for physicists
  • System requirements: move jobs rather than data (ship the code securely, ship the state objects); run on multiple CPUs (platform independent, merging of results); checkpointing
  • Mobile agent paradigm: remote procedure calls vs remote programming; the latter can deal with larger chunks. Serialisation and de-serialisation required for transfer of state to another machine
  • Java as implementation of mobile agents: platform independent, secure execution, object serialisation, reflection. Several implementations on the market
  • Mobile and stationary agents - topologies: serial, parallel (merge agent), parallelisation at various level
  • Agents involved: Job, Result, Task, Data
  • Prototype implementation at the Cern School of Computing: 35 machines, each with 150 MB of data locally stored. Running a small program in Java traversed 5 GB in less than 10 minutes (shipping 5 GB over a 10 Mbit/s link takes about 1.5 hours)
  • Problems integrating with Fortran/C++; also not easy to make fail-safe
Discussion
  • Q: What if an agent does not come back? A: There are some ideas how to cope with that, but it is not currently implemented in any of the systems on the market
  • Q: Should we understand the talk as a recommendation to skip C++ and go to Java? A: Yes
  • Q: Would it be possible to cope with the Java/C++ problem by restricting the agents to manager functionality? A: Problem is that the state needs to be preserved
  • Q: In a scenario where platform independence is not important, what is the advantage of the proposed scheme over one with, say, Condor and/or PVM? A: No interception of system calls required when using Java agents
Fri 09:00 Topic S. Fisher: Status of Spider
References Slides: HTML, PS, PDF, PPT
Summary
  • At the last meeting, not too much optimism; now, work is suspended. The main problem is the idea of a common project, with not all experiments behind it. IPT would like to carry on with willing partners only
  • Two projects: CodeCheck - no action required; no advantage to be gained from Spider at this point in time
  • SRT - work was driven by Atlas, CMS have started their own system. Could do nothing, but in any case, we may need to go ahead without formal interaction with others. Unlikely to have a HEP wide SRT, except via informal contacts
  • Criticism of the existing SRT
  • If we go ahead, create requirements document (have benefitted from contacts with other experiments); get it reviewed (offline and online experts, innocent users); compare it to existing solutions, evaluate them including their cost, next steps depend on the outcome of this process
Discussion
  • Q: Why don't we go ahead with the current SRT? A: Changes are required, and there are maintenance problems
  • Before we know the requirements, we cannot evaluate the tools
  • Q: What is the time scale? A: Requirements in not much more than a week, review takes a couple of weeks, evaluation again a couple of weeks
  • Q: What does Geant4 and LHC++ use? A: Plain make, this is one of their problems
  • Should not rule out SRT right now, if we need it we will find the manpower
  • Need to take conversion costs into account, which are somewhat unclear
  • Would be a pity not to be able to exploit IPT person power
  • For maintenance, should focus on organisations, not persons
  • Questionable whether a short-term evaluation can address all issues of scalability and maintainability
  • Existing commitments and milestones need to be respected
  • Q: What is the functionality issue? A: The main point is that everything is rebuilt for a release
  • Should in any case stick to the idea of online and offline using the same tools
  • Should be discussed at next LCB workshop
Fri 09:40 Topic RD Schaffer: Report from data base WG
References N/A
Summary
  • Topics: production data base, fallback solution, monarc, detector description
  • Production data base workshop: 11/12 May at CERN. A broad variety of solutions is exploited now. Agreement on an identification scheme for built parts. Recommendations for central and local tools to follow
  • Detector description: exploiting OIF (Object Interchange Format) parser, infrastructure exists, now organising working meetings with systems. Want to go for simplistic Ascii format, not XML right now
  • Simple persistency (also with a possible fallback for Objy in mind): small prototype of persistency tool based on Objy ideas to be released in summer (RD45 meeting in July)
  • Monarc test bed with Objectivity/DB: performance measurements on wide scale distribution of data
Discussion
  • Q: Why is the simple persistency going on? Are people not satisfied with Objectivity? A: There are several motivations - one is to have something lightweight which could easily be used on a portable, the other is to evaluate the potential cost of a home-grown OODBMS solution. This package is ODMG compliant
  • This is a remarkable change of strategy
  • Good news that risk assessment is taken seriously by RD45, up to LCB to judge whether this is a reasonable strategy, there is a review process in place
Fri 10:00 Topic J. Hrivnac: Graphics WG
References N/A
Summary
  • Discussion about plans and schedules
  • XML proposal
Fri 10:05 Topic G. Poulard: Reconstruction WG
References Slides: HTML, PS, PDF, PPT
Summary
  • Atrecon: huge amount of work for the physics TDR, needs consolidation (a stamped version as the physics TDR reference); difficult to reproduce all physics TDR results. Evolution: keep a running version, some improvement requests (detector geometry), adiabatic evolution to C++
  • New structure and strategy unclear due to ongoing changes
  • OO projects: xKalMan++, iPatRec to be integrated into Arve in summer, xKalman++ to be put into repository, and to be documented, new clustering to be used
  • New muon reconstruction code: integration to be expected in June
  • Arve: What does it really mean? What does it mean to put software into it?
  • Access to data: user guide exists, digits from Geant3 (except Tilecal)
  • Detector description: AMDB exists, requirements needed - wait for nomination of task leaders
  • Muon identification: convergence needed between muon system and ID, but otherwise well advanced
  • OO Kalman filtering in L2 trigger software - what is the future of Astra? Need to ensure consistency of geometry
  • Convergence to common classes like geometry, cluster, space point, track. Track class now proposed, will be documented on the Web, dedicated discussion during next SW workshop
  • End 99 milestone for reconstruction? ID in good shape, uncertainties on calorimeters and muons
Discussion
  • Q: Shouldn't one go ahead and create a prototype? A: It exists in the repository
  • We should decide now who is going to address the Arve questions, and how to go along with the component model - this is a natural topic for the architecture task force
  • Q: Is it conceivable that Geant4 would store the digits in the same format as Geant3? A: Unclear
Fri 10:30 Topic L. Perini: World-wide computing WG
References N/A
Summary
  • Introduction to Monarc
  • Activity to bring together people interested in regional centres, taking into account all boundary conditions. Presentations by representatives of potential regional centres
  • Interim thinking: an RC should provide 10...20% of CERN's resources for the respective experiment, implying a natural limit on their number. Service centres are distinct from regional centres and serve eg. MC production
  • Rough estimates of capacity in 2007: 700 k SpecInt95, 740 TB disks, 3 PB tape for a single experiment. To be understood: networking, role of magnetic tape, relative importance of I/O to computation
  • IN2P3: at 1/3 of CERN's CPU capacity; plan to stay tuned
  • INFN: start from scratch, unclear whether there will be more than one centre for the various experiments
  • US-Atlas and Fermilab planning to cope with numbers projected by Les Robertson
  • Germany, Japan: discussions ongoing
  • Simulation: written in Java, implements all basic ingredients (including Objy, tapes, networks, ...) of LHC computing
  • Testbed for Objectivity over wide area networks being built up in CERN, Japan, Italy; using Atlfast++
Discussion
  • Q: What are the plans for the accessibility of regional centres? A: Idea is that they should be open to the experiment, possibly with different priorities, but the discussion is not finished yet
Fri 11:15 Topic J. Knobloch: LCB workshop: Atlas feed back
References Slides: HTML, PS, PDF, PPT
Summary
  • Atlas contributions: Architecture, data persistency: Atlas event scheme, detector description, rapporteur? World-wide computing: Monarc, rapporteur from Atlas? Simulation: we are most advanced users of Geant4, should nominate rapporteur. Analysis tools: Experience from Atlas, Atlas part of Monarc, rapporteur from Atlas?
Fri 11:20 Topic H. Meinhard: Workshop summary
References Slides: HTML, PS, PDF, PPT
Summary (see slides)
Fri 11:50 Topic N. McCubbin: A few preliminary remarks
References N/A
Summary
  • E-mail: N.McCubbin@rl.ac.uk
  • Next points assume approval by CB
  • see CV for brief summary of background
  • Transition period of Norman: steady state (~ 80%) by October '99
  • Software effort and OO: serious lack of manpower. OO is hoped to overcome Brooks's law (adding people to a late software project makes it even later)
  • Attempting encapsulation is not new, but Fortran Common blocks made it difficult
  • Students come with knowledge of modern methods, and need jobs after HEP
  • Software MOU: raises the profile of software to that of the detectors
  • What's new in LHC software? Algorithmic complexity (eg. to fit a track) not much changed, but all the rest (data volumes, CPU power, size of collaboration, precision, software lifetime, computing environment, ...)
  • Mission of Atlas: physics. Physics and software must be in symbiosis, with the community (singular!) at large responsible
  • A lot of pieces out there, architecture task force must address how to organise this, and how to go forward. Suggestions, thoughts, ideas, solutions, ... to Norman
  • Training: How did YOU learn?
Discussion
  • We are not in that bad a shape concerning the MOU - funding agencies have been warned. However, the experience with the Geant4 MOU has been painful
  • The statement that the complexity did not increase was challenged
  • Term 'work unit' needs to be specified, probably much below the MOU level
  • Q: How will the organisation work in the next six months? A: Don't know in detail yet, starting points are review report and action plan. In transitional period, Jürgen will be very important
  • Q: Do you agree that the systems should nominate system task leaders before the overall coordinators are known? A: In principle yes, but needs more thinking. Since it is a finite number of people involved, they should be looked at individually
  • Q: When should the architecture group start? A: As soon as possible, its role is very important
  • Q: What does architecture encompass? A: Complex issue... but it should facilitate the partitioning of the software (MOU, work units). Not detailed discussion about classes, not hardware architecture
  • Q: How do conventions about how to build the work package get set up?
  • Q: How do you see the quality control group, and its relation with the architecture group? A: It should also start as soon as possible; Norman does not see a big danger that the two groups will get in each other's way
  • If the architecture task force does not define interfaces and glue, it must instantiate a group which does


Helge Meinhard / May 1999
Last update: $Id: minutes.html,v 1.8 1999/08/23 11:31:58 helge Exp $