Day
| Time
|
|
|
Mon
| 11:00
| Topic
| A. Khodabandeh: Introduction to Configuration
Management
|
References
| Slides: HTML,
PS,
PDF
|
Summary
|
- Role game: Lamp owner, fan owner, power supply assembly, battery
maker to illustrate some of the basic problems
- Key questions: What are the problems? What are their causes? What
can be done to avoid them?
- What is needed? Full bunch of possible keywords
- Why configuration management? To know what we have to produce, where
it is and in which state, only the right people can use or change it,
to understand the impact of changes, to make sure needed information
is available, and that agreed procedures are followed
- Functions: Configuration identification, configuration control,
status accounting, configuration auditing
- Configuration items: Anything that needs to be controlled; items need to
be identified first. For software: source code, test data and test
code, design diagram, documentation, project plan, compilers,
libraries... in general, everything the loss of which would seriously
affect the project. Need to be put into a hierarchy, need to address
the class vs instance question
- Configuration identification: deals with types of configuration items,
organises the structure, naming conventions, version numbering scheme,
baseline planning
- Configuration control: setup library (eg. software repository) with
controlled checkin-checkout administering all versions of all
configuration items, guaranteed integrity. Reactions on proposed
baselines: bug reports, change requests; requires evaluation of
importance, relations, and impact. Rejected requests need to be
archived
- What about quick fixes? Must be possible in cases of urgency (eg. DAQ
system broken), but proper follow-up (only occurrence, best fix,
side effects, known problem, ...) is required asap afterwards
- Configuration auditing: functional audit, physical audit, generate
problem report
- Status accounting: collect data on status of items, provide visibility
of these data (automatic reports, answering of queries), notification,
right data and right format
- Where to start? First, does one really want to do configuration
management? If so, who will be in charge (software librarian,
configuration manager, ...)? What is really needed (size and
importance of the project, distributed development, ...)? When is it
needed? (Elements of CM can be put in place step by step.) How to
implement CM?
- CM implementation: circular process involving planning, defining
procedures, dealing with people, making decisions, automating support,
migrating. Tools can help, but they are by far not the only aspect.
Tool should be chosen only once it is really clear what one wants to
achieve
- What is being done in HEP: CVS, various SRT flavours, CMT, SCRAM (CMS),
SCaM (CERN accelerator sector)
- To go further: check
http://spider.cern.ch/Processes/ConfigurationManagement, with commented pointers
to other Web pages, book lists etc. Suggestions for updates to
spider@cern.ch
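The controlled check-in/check-out cycle described for the configuration
library can be sketched as follows. This is a minimal toy illustration of
the idea (all versions retained, exclusive checkout, guaranteed history),
not the behaviour of any specific CM tool; all names are invented:

```python
class ConfigLibrary:
    """Toy library with controlled check-in/check-out of configuration items."""

    def __init__(self):
        self.versions = {}        # item name -> list of all stored versions
        self.checked_out = set()  # items currently locked for modification

    def add_item(self, name, content):
        self.versions[name] = [content]

    def check_out(self, name):
        # exclusive checkout: a second checkout of the same item is refused
        if name in self.checked_out:
            raise RuntimeError(name + " is already checked out")
        self.checked_out.add(name)
        return self.versions[name][-1]   # hand out the latest version

    def check_in(self, name, content):
        if name not in self.checked_out:
            raise RuntimeError(name + " was not checked out")
        self.versions[name].append(content)  # old versions are never lost
        self.checked_out.discard(name)

lib = ConfigLibrary()
lib.add_item("lamp.c", "v1 source")
base = lib.check_out("lamp.c")
lib.check_in("lamp.c", base + " + bug fix")
```

The point of the sketch is only the administrative guarantee: every version
of every configuration item stays retrievable, and concurrent uncontrolled
changes are prevented.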
|
Mon
| 14:00
| Topic
| D. Duellmann: Object data bases as data stores for HEP,
part II
|
References
| Slides: HTML,
PS,
PDF,
PPT
|
Summary
|
- Physical and logical model: Federation, databases, containers vs
logical view made of objects and associations, allows for optimisations
transparent to the user
- Reminder of present limits in terms of federation sizes etc.
- Basic architecture: lightweight "servers", hence fewer scalability
problems
- Examples in /afs/cern.ch/sw/lhcxx/share/HepODBMS/pro/examples
- First example: populate a database with persistent events (definition
of classes, create federation with data bases and containers, create
event objects)
- Class definition: in .ddl files similar to standard C++ header files,
need to inherit from persistent base class. Persistent classes
cannot contain other persistent classes as data members (however,
references are possible), nor C++ pointers or references (C++ pointers
to be replaced by data base smart pointers). Additional features of
DDLs: variable-length arrays as data members, bi-directional
associations, 1-to-N or M-to-N associations. Preprocessor (ooddlx)
produces schema source code and header files, and puts schema
information into the federation
- Object browser exists to interactively look at database contents
- HepODBMS (RD45 development): shielding layer for independence of vendor
or release changes, HEP specific high-level classes
- Example code: the 'new' operator allows one to specify a clustering hint;
centralised treatment of clustering hints in HepODBMS
- Container limitations: check container size when new object is created,
manage a persistent list of containers
- Persistent analysis objects: LHC++ uses Objy for histograms, tags and
event data. OIDs used to directly access objects. Earlier idea: one
federation per user, migrating to one federation per experiment
- Ntuple vs TagDB approach: tags more flexible
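The rule above, that persistent classes may hold database smart references
but never raw C++ pointers, can be illustrated with a toy analogue (written
here in Python purely for brevity; the names ObjectStore/Ref are invented
and are not the HepODBMS or Objectivity API):

```python
class ObjectStore:
    """Toy stand-in for a database: objects are reached only via OIDs."""

    def __init__(self):
        self._objects = {}
        self._next_oid = 0

    def persist(self, obj):
        oid = self._next_oid
        self._next_oid += 1
        self._objects[oid] = obj
        return Ref(self, oid)     # caller gets a smart reference, not a pointer

    def lookup(self, oid):
        return self._objects[oid]

class Ref:
    """Smart reference replacing a raw in-memory pointer."""
    def __init__(self, store, oid):
        self.store, self.oid = store, oid
    def deref(self):
        return self.store.lookup(self.oid)

class Track:
    def __init__(self, pt):
        self.pt = pt

class Event:
    # data members are plain values or Refs, never embedded persistent objects
    def __init__(self, run, track_refs):
        self.run = run
        self.track_refs = track_refs

store = ObjectStore()
refs = [store.persist(Track(pt)) for pt in (1.5, 20.0)]
ev_ref = store.persist(Event(run=42, track_refs=refs))
```

The association survives independently of where the objects sit in memory,
which is why a database smart pointer can work where a C++ pointer cannot.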
|
Tue
| 09:00
| Topic
| J. Knobloch: Introduction, workshop agenda
|
Summary
|
- Overview of agenda: 2 tutorials, Changes in Atlas Computing, LCB
workshop in Marseille, simulation, training, tools
- Analysis tools workshop
- Possibly a presentation of the candidate computing coordinator
|
Tue
| 09:10
| Topic
| T. Åkesson: Changes in Atlas Computing
|
References
| Slides: HTML
|
Summary
|
- Background: Atlas Computing Review, number of findings leading to
many recommendations, requesting immediate actions from management.
Report presented to EB which took note, management to put forward
action plan to the EB
- View of Atlas management: no technical decisions by management, putting
together groups of people spanning different views in Atlas, taking
into account work done so far in Atlas and in other experiments. Aiming
for a system of institutional commitments
- Action plan presented in last software workshop, put forward to April
EB, approved after some changes, collaboration informed
- Components of action plan: architecture task force, quality control
group, national board, systems responsibles (SW, reconstruction,
simulation, data base task leaders), training
- Some uncertainties during the transition period are unavoidable
- Mandates and compositions of the architecture task force and the quality
control group
- National board: caring about networks, platforms, regional centres,
collaborative tools, ...; composition: divided according to funding
agencies, include a small regional centre working group with Monarc
participation
- Training: a global issue
- Coordinators: computing: responsible for production of software as
project leader for core software and as coordinator for detector
specific software, Norman McCubbin proposed; physics: setting
requirements and verifying performance, Fabiola Gianotti proposed
- More precise discussion of mandates
|
Discussion
|
- Q: What is the relationship between the systems task leaders and the
architecture task force? A: There is no formal one
- Q: Is it intended to have Atlas-wide software, reconstruction,
simulation, and data base coordinators? A: Presumably yes, the system
task leaders will form working groups which will elect chair persons
- Reconstruction and simulation integration require full-time efforts
each
- Q: What is the role of the regional centres? Are they supposed to
contribute to the core software effort? A: This depends on the case,
but for some centres, this seems conceivable
- Q: Discussion seems to assume that CERN contribution to LHC computing
is known, and that basic decisions about regional centres have been
taken, although this is not the case. A: Yes, that is clearly a case
for the national board
- Q: It is necessary now to clarify the role and the status of the
overall coordination roles (simulation, reconstruction, data base).
A: This is acknowledged, but not too many firm decisions should be made
before the computing coordinator is elected. Also, some flexibility in
the setup will be required
- Q: It would have been very advisable to define that existing mandates
would extend until further notice. A: The transition period is not
really until the new computing coordinator is fully functional, but
until the architecture task force has started
- Q: Some existing groups (graphics, control, analysis tools ...) are
not represented in the task forces. Does that mean they are cancelled?
A: No, this issue will be dealt with in due time
- Community wants clear signal that work should continue until indicated
otherwise
- Q: Is there a time scale for putting the new computing organisation in
place? What is the procedure for nominations and elections? A: By
beginning of September, there should be a clear picture including all
essential nominations. System task leaders to be nominated by systems,
overall coordinators will be proposed by computing oversight board
- Q: Is there anybody foreseen with an architectural role? A: This is
up to the computing coordinator, but it is conceivable that such a
person will be required
- Q: Architecture is of utmost importance now. What is the mechanism of
communication between the architecture task force and the systems?
A: That's why the architecture task force should start as soon as
possible. Proper communication with the community is vital for the
success
- Q: When is the architecture task force going to get started? A: There
are still points of discussion with the CERN group
- There must be strong links between the system task leaders and
the architecture task force
- Q: If the computing coordinator is at the same time project leader for
core software, why hasn't the computing community had a chance to form
their opinion and propose somebody? A: The computing coordinator needs
to be treated as a coordinator, with one proposal put forward by the
management to the CB
- Q: Is there any contingency plan for the software?
- Q: What about the end of 99 status report? A: This can only be seen
with time, once the structures are in place. We should not be looking
into the past too much
- Q: For the nominations of the coordinators, have the communities in
other cases not been consulted as well? A: The collaboration has been
asked to give their input, which has been very carefully considered by
the management. The central task is to bring the communities working on
the new software, and on the physics TDR software, together
- It would have been wiser to seek the support of the computing community
for the nomination
- For the LHCC status report, advice by the referees should be sought as
to what the report should address
- The idea of defining work packages is considered very useful
- There is some hope that people will look at the changes in a positive
spirit
|
Tue
| 10:50
| Topic
| M. Stavrianakou: Repository and releases
|
References
| Slides: PS,
PDF,
SDD
|
Summary
|
- Production software mostly stable except for reconstruction, Atlfast
and applications (being ported to more platforms)
- C++ software steadily evolving
- Suggestion to have an overview over existing C++ packages given in one
of the next workshops
- Supported platforms: HP, DEC, IBM, Linux, Solaris; not supported:
SGI (some work done in Boston), WNT
- Releases: roughly fortnightly (26 so far in one year), nightly builds
(not really used yet by developers); Fortran software usually builds,
problems with new software. Production release planned for June
- Outstanding issues: generator packages out of date (next physics
coordinator to take up), package author list to be updated, move
to new CLHEP and CERNLIB to be scheduled, SRT compilation log analyser,
releases both in debug and optimised mode, librarian support (deputy)
- SRT: some improvements required (documentation, functionality - cope
with increasing release size, concurrent debug/optimised releases, non-
global releases and sharing of binaries). Spider/SRT project stopped by
IT division. Maintenance problem - action is needed now
|
Discussion
|
- Atlas10 (new HPUX 10.20) has problems linking
- What about IBM? Not clear whether they will move to the C++ standard
- After the sudden death of the Spider project, we are to review the
situation and evaluate the existing solutions, task force is needed
- Could try to confront Spider work model with Atlas requirements
|
Tue
| 11:10
| Topic
| D. Rousseau: TDR software and productions
|
References
| Slides: PS,
PDF
|
Summary
|
- Simulation: Dice mostly frozen in February 1998, with the exception of
the muons. Number of productions done with obsolete geometry. Dice
in cvs, but cvs version not used for production yet. New pile-up method
used in calorimeters
- Reconstruction: common clustering, IPATREC and XKALMAN widely used,
PIXLREC less heavily used. XKALMAN++ tested, will be moved to
repository. Calorimetry: JetFinder libraries, more detailed output.
Muon system: better pattern recognition and fitting for low pt,
correct covariance matrix, timing improved
- Timings: no optimisation for CPU time done yet
- Combined reconstructions: complex matrix of combinations. Not all
algorithms properly integrated in Atrecon because of timing, some
just designed to run on the combined N-tuples
- Vertexing: conversion finding, K0_s and secondary vertex, primary
vertex
- e/gamma identification
- Combined muon measurement: two approaches: combining muonbox tracks
with xkalman tracks; refit of ID hits with muonbox tracks
- Lessons to learn from CBNT: different usage in groups, both for
optimisation of combined performance tools and physics studies. Hitting
annoying limit of 50'000 variables, hence careful tuning of contents
required. Small size appreciated for export. Interest clearly
demonstrated, but solution should be better than Hbook + Paw
- Productions: simulation: lots of different channels, bottleneck was
person power for supervision. Reconstruction: mostly done on private
basis, not much centrally organised
- Conclusions: TDR software ready in time, all holes successfully filled.
Next steps (before transition to OO): collect and archive information
about productions, collect code so far kept private, get comments of
people using the software about their likes and dislikes, check
DICE CVS version. What about coping with updated geometry: implement
that in Dice or move to Geant4 first?
|
Discussion
|
- A decision needs to be made about the potential DICE geometry update
- Tendency is to go for a freeze of the Fortran software, unless the new
coordinators decide otherwise
|
Tue
| 11:50
| Topic
| H. Meinhard: Platforms, other Focus issues
|
References
| Slides: HTML,
PS,
PDF,
PPT
|
Summary
|
- Focus: What it is, its mandate, new chairman (Paul Jeffreys) and
new secretary (Marco Cattaneo)
- Widely accepted trend to go for PCs for physics data processing;
Linux and NT (the latter mainly for commodity and productivity
software)
- Policy proposal, generally accepted: Improve support for Linux and
NT, discourage investments into Risc hardware, commercial Unix O/S
to be frozen, end date for support to be determined
- Other items: Storage management, HSM (continue HPSS, but no firm
decision until end 2000), shift software (major revision ongoing);
Y2K; new printing service; changes in IT structure
- Distribution of LHC++ and G4: one compiler and OS per platform for
LHC++, taking source code changes for unsupported combinations back
into repository, and accepting contributed binaries into standard
places. CLHEP released more frequently than the rest of LHC++.
Licensing and access restrictions under discussion, will be open
(GPL). Geant4 so far only distributed as source tar files, will
distribute optimised libraries from next release on
|
Discussion
|
- Printers: difference with xprint, does xprint still exist
- Asis: still around in three years?
- LHC++ distribution via asis? No progress yet
|
Tue
| 12:10
| Topic
| J. Knobloch: LCB workshop
|
References
| Slides: HTML,
PS,
PDF,
PPT
|
Summary
|
- Date: September 28 till October 1st, place: Marseille (France),
Web site: http://marcpl2.in2p3.fr/LCB/. Participants: mainly people
from LHC experiments, some from IT, BaBar, Fermilab. No quota
- Subjects: Architecture (components, design issues), technology
tracking (networking, processors, memory, storage), world-wide
computing (Monarc, collaborative tools, data management), simulation,
analysis tools (available tools, practical experience)
- Format: Each session introduced by rapporteur summarising the thinking
of the experiments, much time devoted to discussions, contributions
from community by early July (abstracts and Web links to get the
rapporteurs interested)
- Next: propose rapporteurs and conveners, establish guidelines
concerning issues to be treated, propose and prepare Atlas
contributions
|
Wed
| 09:05
| Topic
| J. Apostolakis: Geant 4 status and experience
|
References
| Slides: HTML,
PS,
PDF,
PPT
|
Summary
|
- Version 4.0.0 in December 1998, marking end of R&D; collaboration
formed since then
- Very powerful G4 kernel (tracking, stacks, geometry, hits), physics
models, additional capabilities (persistency, visualisation),
greatly surpasses Geant 3
- Tracking: general and flexible; event: powerful stacking at no extra
cost; geometry: hierarchical or flat; voxels for speed; hits: user
defined
- All processes at least at G3 level; hadronic processes: distinguish
process and model, models data driven or parametrisation driven
- Geant 4 experiences: Atlas, CMS (from AIHENP99), Borexino, BaBar (fast
simulation)
- Comparison by CMS of Geant3, Geant4, and test beam data
- Borexino application of Geant 4: very realistic simulation of the
geometry and the processes
- BaBar: using Geant4 for their fast simulation (simplified geometry),
using G4 facility of parametrised processes. Full simulation with
G4 under development
- Benchmarks: focus on EM physics performance, comparing speed at constant
physics and physics at constant speed, using two configurations (thin
silicon, sampling calorimeter). Performance lead of Geant4 over Geant3
- Since January 99, urgent patches with fixes released, consolidation
release due end May 99 (fixes, minor improvements, few more models,
ability to use STL rather than RogueWave); 4.1.0 scheduled for end
July 99 (additional physics models, more functional improvements)
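The distinction drawn above between a hadronic process and its models can
be sketched as follows: one process delegates to interchangeable models,
each valid in its own energy range. A hedged toy illustration only; the
model names and energy ranges here are invented, not Geant4's:

```python
class Model:
    """One physics model, valid in a limited energy range (units: GeV)."""
    def __init__(self, name, e_min, e_max):
        self.name, self.e_min, self.e_max = name, e_min, e_max
    def applies(self, energy):
        return self.e_min <= energy < self.e_max

class HadronicProcess:
    """The process only selects and delegates; the physics lives in models."""
    def __init__(self, models):
        self.models = models
    def select_model(self, energy):
        for m in self.models:
            if m.applies(energy):
                return m
        raise ValueError("no model covers this energy")

process = HadronicProcess([
    Model("parametrisation-driven", 0.0, 10.0),   # hypothetical low-energy model
    Model("data-driven", 10.0, 1000.0),           # hypothetical high-energy model
])
```

Separating the two lets models (data driven or parametrisation driven) be
swapped without touching the process that invokes them.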
|
Discussion
|
- Q: Are there tools to do pile up studies for hits and/or digis? A:
Yes, something was implemented, Makoto Asai is the expert
- Q: Were there any geometries created from CAD systems? A: Not known
|
Wed
| 09:30
| Topic
| A. dell'Acqua: Detector simulation activities in Atlas
|
References
| Slides: HTML,
PS,
PDF,
PPT
|
Summary
|
- CHAOS project: Aim to exploit Geant4, to implement the future Atlas
simulation program through OO analysis and design, and to bring new
people on board (on Geant4, on C++, on OO)
- Start with a core group (former G4 members), follow a formal process,
iterating through categories, OO design, implementation
- Prototype work started to get people interested, lots of buglets in
Geant4 found
- Training program set up, simulation group being built. Aim: simulation
program which surpasses DICE in functionality, and which is
maintainable
- Categories: concentrating on ChaosSimulationControl,
detectorConstruction, detectorDescription
- Design category by category (example: run and event category)
- Prototyping going on practically on all subdetector systems (with the
exception of EM calo). Muon system almost completely simulated, takes
detector description from AMDB, integration of B field classes,
tracking in magnetic field. Problems due to a bug in Geant4 (chambers
were not transparent...). Hits being implemented now, will use
detector description scheme (by RD and Christian), direct reading from
AMDB meanwhile. Acceptance studies; medium term: interface to
reconstruction
- Another prototype: silicon tracker (Makoto Asai), rather formal
approach involving careful OO design. Parametrisation pushed to its
limits
- Tile cal testbeam: Code written for G4 course, not very well
structured, being redesigned, many problems in tracking in Geant 4
found (and meanwhile solved). Adding hits, digits, N-tuple facilities.
Putting everything into G4 framework. To be confronted with Geant 3
data
- Work going on with TRT simulation (Maya), hadron endcap and forward
calorimeter (Rachid), TGCs (H. Kurasige)
- Missing: EM calo; geometry cannot be built the same way as with G3,
performance implications to be understood, new functionality for G4
geometry may need to be requested
- Training: first course given on 16 - 19 February for Tile cal (20
people). Positive feedback, but course was too short. Another course
arranged for muon group end June, some participants from ID, aiming
at a 5-day course
- Bits and pieces are falling into place, can start building mock-up
geometry, need to push for more test beam simulations, give component
model a try, improve communication (Web page...), interface with other
domains of Atlas OO software (detector description is of utmost
urgency), work on generators (Fortran wrapping), simulation jamboree
(delayed from May to summer, with new coordinators)
|
Discussion
|
- Q: What size is the core group? A: About 10 people
- Q: Is the silicon prototype for the barrel only, or does it include
endcaps? A: For the time being, it's barrel only, but with hits and
digits
- Q: What is Momo? A: It's the graphical user interface used in Geant4
- Q: Is the testing procedure in Geant 4 adequate, given that such an
incredible bug in the tracking could slip through? A: No, it is being
improved
- Q: Do we have a mechanism to decide which of the patches to install?
A: For the time being, we install all patches
- We should absolutely avoid creating another private version...
- Q: What fraction of the Geant4 code are we testing with our examples?
A: We aim at testing most of the physics, and of the geometry.
Importing from CAD systems is not yet tested by us. Most other parts
have not been tested a lot
- For the link between detector description and simulation, working
meetings need to be organised
- Q: For how long do we need to maintain Dice? Which version of the
geometry should be ported to Geant4? A: The latest Dice geometry
should be ported, and Dice development should be stopped, but the
changes in ID geometry need to be discussed
- Q: What is the status of the changes of the pixels? A: There is code
for Dice in CMZ, but it has not been tested much
- Proposed decision: suspend implementation of the changes in Dice unless
strong evidence is put forward that it is necessary
|
Wed
| 10:25
| Topic
| M. Stavrianakou: TRT test beam sector prototype
simulation with Geant4
|
References
| Slides: PS,
PDF,
SDD
|
Summary
|
- 5 sectors of 16 planes each, each plane made of 1 radiator plane and
16 planes of straws
- G4 particle gun, all standard G4 physics processes included, problems
being investigated. Different TR models to be used once available
- Hits implemented as simple objects, digits implemented in rudimentary form
- Detector geometry debugged (using DAWN and DAVID), tested with tracking
(10 k pions), material "measured" by shooting Geantinos (8...10% X0)
- Results (energy deposit, number of hits) for incident pions look
qualitatively correct, more quantitative checking needed by experts.
For electrons, there is an unphysical peak in the energy deposit which
is being looked at
- Next steps: Wait for electron bug to be fixed, study electrons with and
without TR, improve digitisation, more realistic geometry, investigate
fast simulation options, first prototype of persistency, transient and
persistent histograms, more complete test beam setup
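The material "measurement" by shooting geantinos amounts to accumulating
traversed thickness in units of radiation length X0 along a straight,
non-interacting track. A rough sketch; the layer thicknesses and X0 values
below are invented, chosen only so the total lands in the quoted 8-10%
X0 ballpark:

```python
def material_in_x0(layers):
    """Sum of thickness/X0 over the layers crossed by a straight track.

    layers: list of (thickness_mm, X0_mm) along the track.
    """
    return sum(t / x0 for t, x0 in layers)

# e.g. two hypothetical radiator blocks of 4% X0 each
layers = [(20.0, 500.0)] * 2
total = material_in_x0(layers)   # 0.08, i.e. 8% X0
```

A real geantino scan would integrate this over the actual geometry and over
many directions, but the bookkeeping per track is just this sum.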
|
Discussion
|
- Q: Is the energy deposited in the straws equal to the energy difference
between incoming and outgoing particles? A: Not yet checked
|
Wed
| 11:00
| Topic
| C. Onions: Training
|
References
| Slides: HTML,
PS,
PDF,
PPT
|
Summary
|
- About half of the countries have nominated training contact people
already. The US nomination has been withdrawn
- Course for systems coordinators and task leaders: Course on June 14 -
18 at CERN on Hands-on Analysis, Design and Programming with C++, being
filled slowly, but can stand more applications
- Consultancy: specific names not yet identified, suggestion is to use
the mailing list atlas-sw-developers@atlas-lb.cern.ch
- Other activities: Consultant discussed training needs, awaiting report;
discussion on a C++ tutorial series by IT division
- UCO presence in building 40, including book shop; update list of
recommended books, CDs, videos; complete training contact list; fill
course
- Next goals: Identify good physics examples; organise design and code
walkthroughs; prepare guidelines for de-centralised training
- Other courses: Geant4 by Andrea dell'Acqua
- Training Web page: http://atlasinfo.cern.ch/GROUPS/SOFTWARE/OO/training
|
Discussion
|
- Marvellous job on C++ and OO, but other areas underrepresented (cvs,
SRT, ...)
- Q: Is there a registration fee for the courses at CERN? A: Yes, the
courses are provided by Educational Services, which charges 200 CHF / day
- Q: What is the status of Java? A: Not being pushed for at the moment,
no need to make special training efforts. Also, there is the series
of IT tutorials on Java
|
Wed
| 11:25
| Topic
| S. Fisher: Case tools for Atlas
|
References
| Slides: HTML,
PS,
PDF,
PPT
|
Summary
|
- Full story:
http://atlasinfo.cern.ch/Atlas/GROUPS/SOFTWARE/OO/tools/case/
- Questionnaire prepared for Atlas, few replies received. Invited other
experiments to reply. In total, 19 (15 Atlas) replied
- People want both a simple and a big tool
- Should not restrict ourselves to one tool
- Customisation is important
- Operating systems requested: NT, Linux, Solaris
- Little difference in what people use Rose and StP for, but greater
satisfaction with Rose
- Best points about Rose and StP: much in common
- Worst points: Rose: crashes often on Unix, very high memory and CPU
consumption; StP: speed of execution, speed of startup, eccentric user
interface
- Rose: only supports commonalities between OMT and Booch, not really
full UML - missing activity diagrams, code repository missing
- StP: ...
- Evaluation of tools: Looked at 12 tools which support UML on Unix and
NT, with C++ support. Details in the Web page
- To be considered seriously: StP UML 7.1: major improvements, but NT
only for the time being, Unix version in September/October. Now a real
Windows program, uses Sybase or Access. Startup: ~ 3 seconds, but high
price to pay in terms of performance for central repository at CERN.
Interface greatly improved
- Very interesting: Together; round-trip is automatic for both Java and
C++, model and code are always in step, and actually works well. Written
in Java.
Performance fine on a 450 MHz PII, easy to use, can define own way of
mapping UML objects to code. Small text files (no data base),
facilitating group work. Code held in memory - need to restrict to
a reasonable size package. Scripting with Java or Python. Whiteboard
edition is free. Documentation generation excellent. No support for
namespaces and nested classes. Summary: superb, reasonably priced
tool
- Argo: free, open source, no C++ support as yet, supports XMI, lots of
good ideas in interface, rather buggy, suffers from the slowness of
Swing. Could perhaps take off
- Recommendation: For heavy tool, stay with StP for the moment. As light
tools, both Together and Argo are very interesting
|
Wed
| 12:10
| Topic
| S. Fisher: Status of reviews
|
References
| Slides: HTML,
PS,
PDF,
PPT
|
Summary
|
- Dig decided that ongoing reviews should be completed
- SRT documentation and design: Waiting for new deliverable
- Muon code, graphics code both waiting for one reviewer
- "DG" issues (issues to be brought up later): important requirement and
manpower issues. Some came up in SRT review: srt configure rather than
autoconf, manual to give advice on sensible use of CVS, missing
chapters. Proposal to ask QC group to take these points on
|
Discussion
|
- Q: What about work finished now? A: No formal decision has been taken;
we should encourage people to carry on reviewing
|
Thu
| 09:10
| Topic
| M. Stavrianakou: Introduction
|
References
| Slides: PS,
PDF,
SDD
|
Summary
| (see slides)
|
Thu
| 09:20
| Topic
| S. Fisher: Analysis tools requirements
|
References
| Slides: HTML,
PS,
PDF,
PPT
R. Somigliana, K. Sliwa: Data analysis
software tools requirements (draft)
|
Summary
| (see slides)
|
Discussion
|
- We need to become clear about what we mean by Analysis
- One should not try to assign responsibilities too early, it is more
important to have a complete view of the requirements
- Why have we replaced the requirements collected in the graphics domain
by something actually worse? We ought to collect more high-level
requirements
- Need to distinguish very clearly between user and system requirements
(the latter can be considered methods)
- Should try and have a common comprehensive set of requirements which
can be projected according to the viewpoint
- Important that requirements are as complete as possible, and ranked
according to priority. Every effort must be made to capture the
requirements of the end user, should not worry about formalisms of the
requirements document too much
- Q: How do we proceed? A: Input hoped for during this meeting, at the
end we could establish a small working group to carry on
- It would be useful to make sure brainstorming results are not lost
- The requirements capture phase should not take too long (much less
than a year), input solicited from all physics groups and systems
- Requirements should be testable
|
Thu
| 10:00
| Topic
| L. Tuura: Architectural issues
|
References
| N/A
|
Summary
|
- Architecture is a moving target, all software is going to be re-written
a couple of times. Should not become religious about these issues
- Important to focus on the target, not on the methods
- Need to deliver now, even if not perfect
- Need to collect experience, requirements may be modified with
experience with the software
- Strategic choices to be made: whole chain from event filter to user
analysis should be consistent, packages should be movable from one
application domain to another; should be as independent of a specific
tool as possible; ease of use or experimentation puts requirements on
performance; C++, object orientation, component architecture
|
Discussion
|
- The new architecture task force is the forum to discuss these strategic
choices, they must be supported by the whole collaboration
- Q: Re-writing parts of the software is fine in the component model as
far as components themselves are concerned, but what if we want to
change the glue between the components? A: This glue is a very thin
layer anyway
- Can the Event Filter afford objects or components, given their
potential overhead? Of course, this is an implementation issue to be
watched
- Q: Shouldn't we aim for large components in order to reduce the
fraction of glue in the system? A: Abstractions should go bottom-up
- Even Level2 algorithms should run in the same framework
|
Thu
| 10:30
| Topic
| C. Tull: StAF Architecture
|
References
| Slides: HTML,
PS,
PDF,
PPT
|
Summary
|
- StAF: Standard analysis framework
- Challenges: data volume, processing power, geographic dispersion of the
community, time frame of projects
- Framework: reuse of code that calls application code
- StAF: horizontal and vertical modularity, explicit graceful retirement,
adoption or imitation of industry standards, scripting access to code
API, relies on code and doc generation tools, dynamic linking and
unlinking, multi-language support
- Horizontal modularity refers to ASP domains, vertical modularity
reflects protocol and interface standards; Corba actually used
- Objects and packages: ASP (analysis services packages), PAM (physics
analysis modules), data objects, error stack, result stack
- ASP: 1 object factory class, 0 or more worker object classes.
Communication via software bus
- User code: C, C++, Fortran possible
- Data set access: self-describing data based on XDR and Corba IDL,
running on large variety of systems
- Data navigation and manipulation a la Unix
- User interface: scripting language preferred over GUI, interface
reflects underlying classes
- Used in Star, Phenix, AGS, Grand Challenge, Clipper
- Framework approach largely exploited in existing analysis programs,
interesting tools and technologies available. Framework and
infrastructure must be available early, and long
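The ASP structure described above (one object factory class plus zero or more worker classes, communicating via a software bus) can be sketched roughly as follows; all class and method names here are invented for illustration, not the actual StAF API:

```python
class SoftwareBus:
    """Toy stand-in for the StAF software bus: a shared registry
    through which packages publish and look up objects."""
    def __init__(self):
        self._registry = {}

    def publish(self, name, obj):
        self._registry[name] = obj

    def lookup(self, name):
        return self._registry[name]


class TrackWorker:
    """Illustrative worker object created by the factory."""
    def __init__(self, track_id):
        self.track_id = track_id


class TrackFactory:
    """One factory class per ASP; it creates worker objects and
    publishes them on the bus for other packages to find."""
    def __init__(self, bus):
        self.bus = bus
        self._next_id = 0

    def create(self):
        worker = TrackWorker(self._next_id)
        self._next_id += 1
        self.bus.publish("track/%d" % worker.track_id, worker)
        return worker


bus = SoftwareBus()
factory = TrackFactory(bus)
t = factory.create()
assert bus.lookup("track/0") is t
```

The point of the bus indirection is that packages only need compatible interfaces, not compile-time knowledge of each other.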
|
Discussion
|
- Q: How does StAF relate to the component model? A: The latter is at a
lower level
- Q: Can all modules or components talk to all other? A: Yes, if they
have compatible interfaces
|
Thu
| 11:25
| Topic
| RD Schaffer: Data mining for analysis
|
References
| Slides: HTML,
PS,
PDF,
PPT
|
Summary
|
- Fairly traditional event structure: raw data, event summary data,
analysis object data, event tag. Constraints: raw data will mostly
remain at CERN, ESD hence needs to contain enough detail to re-create
the AOD
- Logical view: From event header, navigation possible to all parts of
the event
- Event collections: collection holds a set of events within a particular
context (selection of events, selection of parts of events)
- Key point: optimise access and turnaround for typical analysis;
handles: layered access (tags, aod, esd), organisation (column-wise,
indexing schemes, reclustering of sparse events). Need to understand
analysis scenarios - what event selections, and what parts of the
events, will be accessed most?
- Current status: fairly early. Event model: access to raw data. Work
started on general tools for ESD and AOD. Available from RD45: Event
collections, hierarchical naming facilities, lightweight persistency
prototype
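The layered event model above (tag, AOD, ESD, raw, with navigation from the event header to all parts, and event collections selecting within a context) might be sketched like this; the class names are hypothetical, not the actual Atlas event classes:

```python
class Tag:
    """Small summary record used for fast event selection."""
    def __init__(self, n_jets, missing_et):
        self.n_jets = n_jets
        self.missing_et = missing_et


class EventHeader:
    """Entry point: from the header, every deeper layer is reachable."""
    def __init__(self, tag, aod=None, esd=None, raw=None):
        self.tag = tag
        self.aod = aod   # analysis object data
        self.esd = esd   # event summary data: enough to re-create the AOD
        self.raw = raw   # raw data mostly stays at CERN, hence optional


def make_collection(events, predicate):
    """An event collection: the set of events selected in some context."""
    return [ev for ev in events if predicate(ev.tag)]


events = [EventHeader(Tag(n, 10.0 * n)) for n in range(5)]
selected = make_collection(events, lambda t: t.n_jets >= 3)
assert [ev.tag.n_jets for ev in selected] == [3, 4]
```

Selecting on the small tag layer first is what makes the layered access cheap: the bulkier AOD/ESD layers are only touched for events that survive the cut.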
|
Discussion
|
- Granularity of data access still needs to be defined, clearly depends
on physical parameters of Objectivity, but realistic scenarios for
analysis are required
- For users, not only copying a subset of the event information is
required, but the user must also be able to add his own information
- During analysis, the data base should be updated only in a very well
controlled way
- How is the question of storage vs just-in-time processing addressed?
|
Thu
| 11:55
| Topic
| M. Stavrianakou: Evaluations of tools
|
References
| Slides on existing tools:
PS,
PDF,
SDD
Slides on evaluation:
PS,
PDF,
SDD
|
Summary
|
- Kinds of analyses to be considered: real physics and test beam analysis,
but also online monitoring, calibrations, ... Specific requirements?
Scenarios / use cases? Can we extrapolate or interpolate?
- Physics and detector communities to provide input, via questionnaire
- Could use questionnaire from D0/CDF on run II data management needs,
would have to be enlarged
- Available tools: CERNLIB components, LHC++ components, ROOT, JAS,
OpenScientist, others (HippoDraw, Grace, various commercial tools)
- How to choose? evaluation according to requirements and use cases,
compliance with the architecture, choice of standards, programming
languages and technologies, resources, timescale
- Functionality to be considered: I/O, histogramming, fitting, plotting,
...
- I/O: Objy/DB, light-weight persistency, ROOT, Root I/O, ... Choice
depending on data types, volumes, access and selection patterns. Should
be as decoupled from the rest as possible; should be possible to use
the appropriate tool for each application, not necessarily always the
same one
- Histogramming: Root, HTL, JAS, OpenScientist, ... Some soon to be
interchangeable, some offering new functionality. How important is the
association to the raw data?
- "Tupling": HEPODBMS event collections and tags, Root trees and
Ntuples, HepTuple
- Fitting and minimisation: Minuit, Gemini, NAG, ... Must be
interchangeable
- Plotting: Paw, Root, HepInventor/HepExplorer, JAS, OpenScientist.
2D poorly addressed by commercial tools, PAW/ROOT paradigm still very
appealing. Interchangeability highly desirable
- Interactivity: Paw, Root, HepExplorer, Jas, OpenScientist. Paw/Root
paradigm very appealing, HepExplorer not convincing the end user,
JAS being considered as alternative, OpenScientist very interesting.
Scripting functionality essential, possibly choice of more than
one. Could use SWIG to interface with Perl, Python, Tcl/Tk, ...;
CINT debated controversially
- Usability and performance, modularity and flexibility, maintainability
and extensibility, replaceability, restrictions imposed by languages
and standards, and by legacy software, resources and timescales
- Atlas should decide which tools to evaluate, LHC expts or CERN or
HEP should make an effort to coordinate analysis tools development
work
- Plans: start review of requirements, start evaluation exercise
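The requirement above that fitting and minimisation engines (Minuit, Gemini, NAG, ...) be interchangeable suggests hiding them behind a common interface; a sketch of such an abstraction follows, with all names invented (this is not any actual Atlas or LHC++ API, and the backend here is a deliberately crude stand-in):

```python
from abc import ABC, abstractmethod


class Minimiser(ABC):
    """Common interface: analysis code depends only on this, so the
    backend (Minuit, Gemini, NAG, ...) can be swapped freely."""
    @abstractmethod
    def minimise(self, func, start, steps=200):
        ...


class CoordinateDescent(Minimiser):
    """Stand-in backend: crude coordinate descent, illustration only."""
    def minimise(self, func, start, steps=200):
        x = list(start)
        h = 0.1                      # step size, shrunk each epoch
        for _ in range(steps):
            for i in range(len(x)):
                for cand in (x[i] - h, x[i] + h):
                    trial = x[:i] + [cand] + x[i + 1:]
                    if func(trial) < func(x):
                        x = trial
            h *= 0.9
        return x


def chi2(p):
    # toy objective with minimum at (0.5, -0.5)
    return (p[0] - 0.5) ** 2 + (p[1] + 0.5) ** 2


best = CoordinateDescent().minimise(chi2, [0.0, 0.0])
assert abs(best[0] - 0.5) < 0.05 and abs(best[1] + 0.5) < 0.05
```

Swapping in a different engine then only means providing another `Minimiser` subclass; user analysis code is untouched.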
|
Discussion
|
- The problem is that the replies to the questionnaires tend to be biased
- This discussion is years too late
- A Hep-wide evaluation and coordination may face serious problems
- Interoperability is a very desirable aim, but we really need to
understand what we are talking about. A Hep-wide standardisation effort
would be invaluable
- Information about physics channels, data volumes etc. still exist from
the CTP times, should still be available
- Q: What is the role of the CBNT? Will it be frozen with the Fortran
software? A: Should serve as a starting point for a more conforming,
more functional implementation
- HEPCCC could also be considered as a body to discuss this question
- LHCb said to have partly and temporarily accepted the Root I/O scheme
- Some more perspective into the future, and forward looking judgement,
is required
|
Thu
| 14:05
| Topic
| J. Schwindling: MLPFIT: a tool to design and use
multi-layer perceptrons
|
References
| Slides: HTML,
PS,
PDF,
PPT
|
Summary
|
- Artificial neural networks started in the 1940s, now widely used in
many areas. HEP started using them in the late 1980s for classification
(particle id, event classification, search for Higgs), track
reconstruction, trigger, function approximation
- Multi-layer perceptron: hidden layers between input layer and output
layer apply a non-linear function to a linear combination of the layer
above them. Output neuron is a linear combination
- Theorems: this network can achieve function approximation, good
discrimination between signal and noise
- Learning phase: tuning of parameters to minimise a chi-square-like
function, requires first derivative of errors
- Learning methods: stochastic minimisation (linear model with fixed or
variable steps), or global minimisation. Stochastic minimisation
works sometimes badly on deterministic problems. Hybrid method by far
fastest
- MLPfit designed to be used both for functional approximation and
classification. Implementation: 3000 lines of C, precise (double
precision), fast, inexpensive (dynamic allocation of memory), easy
to use
- Performance: at least competitive with well known and widely used
packages (Jetnet, SNNS)
- MLPfit reads Ascii file with all parameters, writes Ascii files
containing its results, and Paw files. Use dgels from Lapack to solve
linear least squares problem
- MLPfit is callable, parameter passing via dedicated routines; Labview
and Paw interfaces available, too
- Plans: support the code, improve it, test it
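A minimal numerical sketch of the perceptron described above: hidden neurons applying a non-linear function (tanh) to a linear combination of their inputs, a linear output neuron, and stochastic-gradient learning on a chi-square-like error. MLPfit itself is ~3000 lines of C; everything below is a toy in plain Python, with invented sizes and constants:

```python
import math
import random

random.seed(1)

# one hidden layer of H tanh neurons, one linear output neuron
H = 5
w1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(H)]  # (weight, bias) per hidden neuron
w2 = [random.uniform(-1, 1) for _ in range(H + 1)]                  # output weights + bias


def forward(x):
    hidden = [math.tanh(w[0] * x + w[1]) for w in w1]
    y = sum(w2[j] * hidden[j] for j in range(H)) + w2[H]
    return y, hidden


def train_step(x, target, eta=0.05):
    """One stochastic-gradient update; needs the first derivative
    of the error, as noted in the talk."""
    y, hidden = forward(x)
    err = y - target
    # hidden-layer gradients via chain rule (d tanh = 1 - tanh^2),
    # computed before w2 is modified
    grads = [err * w2[j] * (1.0 - hidden[j] ** 2) for j in range(H)]
    for j in range(H):
        w2[j] -= eta * err * hidden[j]
    w2[H] -= eta * err
    for j, g in enumerate(grads):
        w1[j][0] -= eta * g * x
        w1[j][1] -= eta * g


# function approximation: learn f(x) = sin(x) on [0, 3.1]
data = [(x / 10.0, math.sin(x / 10.0)) for x in range(32)]


def sq_error():
    return sum((forward(x)[0] - t) ** 2 for x, t in data)


before = sq_error()
for _ in range(500):
    for x, t in data:
        train_step(x, t)
after = sq_error()
assert after < before  # learning reduces the chi-square-like error
```

This corresponds to the "stochastic minimisation with fixed steps" method from the talk; the hybrid method reported as fastest combines such steps with a global linear solve for the output layer.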
|
Discussion
|
- Q: What about the argument against neural networks that you never
find something unexpected, and that you never exactly know what you
are doing? A: If it were deterministic, it would be linear; one would be
missing the additional power of this non-linear approach
- Q: What about error estimates for the result of a neural network?
A: This is being studied by mathematicians, no satisfactory answer yet
|
Thu
| 14:50
| Topic
| D. Malon: Data mining
|
References
| Slides: HTML,
PS,
PDF,
PPT
|
Summary
|
- What does data mining mean? Examples of standard application domains
and questions. Essential characteristics is knowledge discovery, not
information retrieval. Commonly emphasis on patterns and models.
In HEP: some people mean random access to large amounts of data, others
to computationally intensive event-by-event analysis
- Unlikely that COTS solutions can fulfil our requirements
- Start from idea of accessing condensed data set on disk, navigating
to more comprehensive information on tertiary storage transparently.
However, arbitrary random access to tapes is to be discouraged
- Tag data base: basis for selection
- Issue has been addressed by DOE Grand Challenge Project; one aim is to
provide information about the number of tape mounts to be expected.
Possible to coordinate the staging requests from within the running
job, and from other requests. Idea is that the exact order of events
in processing does not matter (foreach vs. standard loop approach).
Easy implementation of parallelism. Corba widely used (can be hidden)
- Less ambitious models: data trains, carousels: looping continuously
through all data, serving events as they are retrieved from the media
- Preferred strategy is still to produce compressed disk-resident
data sets
- Rough sets being studied
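The "foreach" idea above — the exact processing order of events does not matter, so staging requests can be coordinated and reordered to minimise tape mounts — can be illustrated with a toy scheduler; the data layout and names are invented for illustration:

```python
from collections import defaultdict


def schedule_by_tape(requests):
    """Group event requests by the tape they live on, so each tape is
    mounted once; within a tape the order is irrelevant ('foreach'
    semantics rather than a fixed loop order)."""
    by_tape = defaultdict(list)
    for tape, event_id in requests:
        by_tape[tape].append(event_id)
    plan = []
    for tape in sorted(by_tape):
        plan.append((tape, sorted(by_tape[tape])))
    return plan


requests = [("T2", 7), ("T1", 3), ("T2", 1), ("T1", 9), ("T2", 4)]
plan = schedule_by_tape(requests)
assert plan == [("T1", [3, 9]), ("T2", [1, 4, 7])]
# two tape mounts instead of five in naive request order
```

The same reordering freedom is what makes the parallel implementation easy: independent chunks of the plan can go to different workers.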
|
Discussion
|
- Q: Are there implementations of rough sets? A: Begun to build a
prototype based on the Star tag database, but no results yet. One has
to worry about biases when concentrating on events which are disk
resident
- Once the data base has been populated with N-tuples or Atlfast++ files,
we could try out some of these concepts
|
Thu
| 15:25
| Topic
| J. Hrivnac: Graphics for analysis
|
References
| Slides: HTML,
PS,
PDF,
AG
|
Summary
|
- Evolvability: need to ensure that system is future-proof
- Mission statement of graphics domain
- Design of graphics made with flexibility in mind - objects should not
know that they can be visualised, design independent of any particular
visualisation tool
- High-level graphics design; data input from event domain (Zebra tapes,
Objy database), or simulation (MC truth)
- Use cases: Object browser: tree a la MS Windows Explorer, drag-and-drop
functionality; end programmer. Automatic creation of visualisation base
classes for any given (non visualisation aware) classes. Scene
developer to write two classes, Scene and Rep
- Democracy of scenes - not all reps need to be implemented for all
scenes
- XML being used for exchange of Ascii information, found to be very
useful. Future: further application, more direct mapping between
Objy and XML files
- Current status of graphics: definitions: real and graphical objects,
operations on real and visualised objects, views, static objects,
streaming objects, volatile objects. Requirements: general, existence,
environment, functional requirements. Design done, underwent review.
Implementation: ED, modeler, resolver functional, will be upgraded to
improve design and functionality; object browser foundation and
prototype exist; Tree builder temporarily implemented
- Views: Event display like: AVRML (3D) fully implemented; Aravis
(ramped-up Arve graphics) implemented, reps to be implemented;
Atlantis (sophisticated ED) well implemented, problem with data access;
Wired (Java based ED working over the Web) well implemented.
Statistical/histogram views: HbookTuple implemented, quite obsolete;
HistOOgrams simply implemented, quite obsolete; AHTL (HTL histograms)
implemented, AOS (OpenScientist histograms) planned, AJas (JAS
histograms) planned, Orca (simple integrated environment) basically
implemented. Misc.: AsciiText fully implemented, XMLScene implemented,
Command
- Interface to data structures or classes need to be implemented,
participation from systems requested
- Documentation: Implementation guidelines, FAQ, design document,
package documentation exist
- Domain interface: Graphics is what you see on the screen; shows the
results of the analysis
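The design principle above — objects should not know that they can be visualised, with a separate Rep class adapting each visualisable class to a Scene — is essentially an adapter pattern. A hedged sketch (Scene and Rep are the names from the talk; everything else is invented):

```python
class Track:
    """Ordinary data object: knows nothing about graphics."""
    def __init__(self, points):
        self.points = points


class Scene:
    """A view that collects drawable primitives from Reps."""
    def __init__(self):
        self.drawn = []

    def draw(self, primitive):
        self.drawn.append(primitive)


class TrackRep:
    """Rep: adapts a non-visualisation-aware Track to a Scene.
    'Democracy of scenes': not every Rep need exist for every Scene."""
    def __init__(self, track):
        self.track = track

    def render(self, scene):
        scene.draw(("polyline", self.track.points))


scene = Scene()
TrackRep(Track([(0, 0), (1, 2)])).render(scene)
assert scene.drawn == [("polyline", [(0, 0), (1, 2)])]
```

Because the visualisation knowledge lives entirely in the Rep, such wrapper classes can in principle be generated automatically for any given class, as the talk suggests.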
|
Thu
| 16:00
| Topic
| S. Resconi: Status of ATLFAST++ in LHC++
|
References
| Slides: HTML,
PS,
PDF,
PPT
|
Summary
|
- Based on concept of makers, many of which are implemented
- Main steps done: eliminate dependencies from Root, check results,
put data into Objectivity, visualisation with HepExplorer
- Root dependency: containers replaced by STL and VArray; interface to
physics processes rewritten
- Comparison with Fortran version: checked on sample of 10'000 events for
two popular channels
- Data into Objectivity: following tag/event data model proposed by RD45,
one single container, no associations
- Read data from Objectivity: visualisation with HEPExplorer, perform
analysis map. Alternatively, read data with simple C++ program, being
used by Monarc - 100'000 events filled (1.7 GB events, 8.4 MB tags)
- Further refinements needed, still some Root classes in, more physics
channels to be checked; can be used to learn more about LHC++
components, and to evaluate analysis tools
|
Discussion
|
- What is the evolution of Atlfast in general? What should be
maintained in the long run? Should the fast simulation not be provided
by Geant4? More discussion needed in the physics community
- Repetition of Atlfast++ tutorial could be very useful
|
Thu
| 16:30
| Topic
| M. Dönszelmann: Java agents
|
References
| Slides: HTML,
PS,
PDF,
PPT
|
Summary
|
- User requirements: Access to data (may need to move jobs rather than
data); access to processing power (1300 machines as of today);
analysis jobs need to be restartable/checkpointable; ease of use for
physicists
- System requirements: move jobs rather than data (ship the code
securely, ship the state objects); run on multiple CPUs (platform
independent, merging of results); checkpointing
- Mobile agent paradigm: remote procedure calls vs remote programming;
the latter can deal with larger chunks. Serialisation and
de-serialisation required for transfer of state to another machine
- Java as implementation of mobile agents: platform independent, secure
execution, object serialisation, reflection. Several implementations
on the market
- Mobile and stationary agents - topologies: serial, parallel (merge
agent), parallelisation at various level
- Agents involved: Job, Result, Task, Data
- Prototype implementation in Cern School of Computing: 35 machines, each
with 150 MB data locally stored. Running small program in Java,
traversed 5 GB in less than 10 minutes (shipping 5 GB over a 10 Mbit/s
link takes more than an hour)
- Problems to integrate with Fortran/C++; also not easy to make fail
safe
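The serialisation/de-serialisation step at the heart of the mobile-agent idea — capturing an agent's state so it can be shipped to another machine and resumed there — can be mimicked in a few lines. The prototype discussed here used Java object serialisation; this Python version with pickle is only an analogy, with invented names:

```python
import pickle


class Agent:
    """A toy mobile agent: it carries its partial result as state,
    so it can be serialised mid-job and resumed elsewhere."""
    def __init__(self):
        self.total = 0
        self.done = []

    def process(self, chunk):
        self.total += sum(chunk)
        self.done.append(len(chunk))


# work on "machine A"
agent = Agent()
agent.process([1, 2, 3])

# ship the serialised state over the network (here: just bytes)
wire = pickle.dumps(agent)

# resume on "machine B" with state intact
agent_b = pickle.loads(wire)
agent_b.process([4, 5])
assert agent_b.total == 15 and agent_b.done == [3, 2]
```

This also shows why restricting agents to pure manager functionality does not sidestep the Fortran/C++ integration problem raised in the discussion: it is exactly this state capture that native code cannot easily provide.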
|
Discussion
|
- Q: What if an agent does not come back? A: There are some ideas how
to cope with that, but it is not currently implemented in any of the
systems on the market
- Q: Should we understand the talk as a recommendation to skip C++ and
go to Java? A: Yes
- Q: Would it be possible to cope with the Java/C++ problem by
restricting the agents to manager functionality? A: Problem is that
the state needs to be preserved
- Q: In a scenario where platform independence is not important, what is
the advantage of the proposed scheme over one with, say, Condor and/or
PVM? A: No interception of system calls required when using Java agents
|
Fri
| 09:00
| Topic
| S. Fisher: Status of Spider
|
References
| Slides: HTML,
PS,
PDF,
PPT
|
Summary
|
- At last meeting, not too much optimism; now, work is suspended. Main
problem is idea of common project, not all experiments behind. IPT
would like to carry on with willing partners only
- Two projects: CodeCheck - no action required; no advantage to be gained
from Spider at this point in time
- SRT - work was driven by Atlas, CMS have started their own system.
Could do nothing, but in any case, we may need to go ahead without
formal interaction with others. Unlikely to have a HEP wide SRT, except
via informal contacts
- Criticism to the existing SRT
- If we go ahead, create requirements document (have benefitted from
contacts with other experiments); get it reviewed (offline and online
experts, innocent users); compare it to existing solutions, evaluate
them including their cost, next steps depend on the outcome of this
process
|
Discussion
|
- Q: Why don't we go ahead with the current SRT? A: Changes are required,
and there are maintenance problems
- Before we know the requirements, we cannot evaluate the tools
- Q: What is the time scale? A: Requirements in not much more than a
week, review takes a couple of weeks, evaluation again a couple of weeks
- Q: What do Geant4 and LHC++ use? A: Plain make, this is one of their
problems
- Should not rule out SRT right now, if we need it we will find the
manpower
- Need to take conversion costs into account, which are somewhat unclear
- Would be a pity not to be able to exploit IPT person power
- For maintenance, should focus on organisations, not persons
- Questionable whether a short-term evaluation can address all issues of
scalability and maintainability
- Existing commitments and milestones need to be respected
- Q: What is the functionality issue? A: The main point is that
everything is rebuilt for a release
- Should in any case stick to the idea of online and offline using the
same tools
- Should be discussed at next LCB workshop
|
Fri
| 09:40
| Topic
| RD Schaffer: Report from data base WG
|
References
| N/A
|
Summary
|
- Topics: production data base, fallback solution, monarc, detector
description
- Production data base workshop: 11/12 May at CERN. Broad variety of
solutions exploited now. Agreement of identification scheme for built
parts. Recommendations for central and local tools to follow
- Detector description: exploiting OIF (Object Interchange Format)
parser, infrastructure exists, now organising working meetings with
systems. Want to go for simplistic Ascii format, not XML right now
- Simple persistency (also with a possible fallback for Objy in mind):
small prototype of persistency tool based on Objy ideas to be released
in summer (RD45 meeting in July)
- Monarc test bed with Objectivity/DB: performance measurements on wide
scale distribution of data
|
Discussion
|
- Q: Why is the simple persistency going on? Are people not satisfied
with Objectivity? A: There are several motivations - one is to have
something lightweight which could easily be used on a portable, the
other is to evaluate the potential cost of a home-grown OODBMS
solution. This package is ODMG compliant
- This is a remarkable change of strategy
- Good news that risk assessment is taken seriously by RD45, up to LCB
to judge whether this is a reasonable strategy, there is a review
process in place
|
Fri
| 10:00
| Topic
| J. Hrivnac: Graphics WG
|
References
| N/A
|
Summary
|
- Discussion about plans and schedules
- XML proposal
|
Fri
| 10:05
| Topic
| G. Poulard: Reconstruction WG
|
References
| Slides: HTML,
PS,
PDF,
PPT
|
Summary
|
- Atrecon: huge amount of work for physics TDR, need consolidation
(stamped version as physics TDR reference one); difficult to reproduce
all physics TDR results. Evolution: keep running version, some
improvement requests (detector geometry), adiabatic evolution to C++
- New structure and strategy unclear due to ongoing changes
- OO projects: xKalMan++, iPatRec to be integrated into Arve in summer,
xKalman++ to be put into repository, and to be documented, new
clustering to be used
- New muon reconstruction code: integration to be expected in June
- Arve: What does it really mean? What does it mean to put software into
it?
- Access to data: user guide exists, digits from Geant3 (except
Tilecal)
- Detector description: AMDB exists, requirements needed - wait for
nomination of task leaders
- Muon identification: convergence needed between muon system and ID,
but otherwise well advanced
- OO Kalman filtering in L2 trigger software - what is the future of
Astra? Need to ensure consistency of geometry
- Convergence to common classes like geometry, cluster, space point,
track. Track class now proposed, will be documented on the Web,
dedicated discussion during next SW workshop
- End 99 milestone for reconstruction? ID in good shape, uncertainties
on calorimeters and muons
|
Discussion
|
- Q: Shouldn't one go ahead and create a prototype? A: It exists in the
repository
- We should decide now who is going to address the Arve questions, and
how to go along with the component model - this is a natural topic for
the architecture task force
- Q: Is it conceivable that Geant4 would store the digits in the same
format as Geant3? A: Unclear
|
Fri
| 10:30
| Topic
| L. Perini: World-wide computing WG
|
References
| N/A
|
Summary
|
- Introduction to Monarc
- Activity to bring together people interested in regional centres,
taking into account all boundary conditions. Presentations by
representatives of potential regional centres
- Interim thinking: RC should provide 10...20% of CERN's resources for
the respective experiment, natural limit in number. Service centres
are distinct from regional centres, serve eg. MC production
- Rough estimates of capacity in 2007: 700 k SpecInt95, 740 TB disks,
3 PB tape for a single experiment. To be understood: networking, role
of magnetic tape, relative importance of I/O to computation
- IN2P3: at 1/3 of CERN's CPU capacity; plan to stay tuned
- INFN: start from scratch, unclear whether there will be more than one
centre for the various experiments
- US-Atlas and Fermilab planning to cope with numbers projected by Les
Robertson
- Germany, Japan: discussions ongoing
- Simulation: written in Java, implements all basic ingredients
(including Objy, tapes, networks, ...) of LHC computing
- Testbed for Objectivity over wide area networks being built up in
CERN, Japan, Italy; using Atlfast++
|
Discussion
|
- Q: What are the plans for the accessibility of regional centres? A:
Idea is that they should be open to the experiment, possibly with
different priorities, but the discussion is not finished yet
|
Fri
| 11:15
| Topic
| J. Knobloch: LCB workshop: Atlas feed back
|
References
| Slides: HTML,
PS,
PDF,
PPT
|
Summary
|
- Atlas contributions: Architecture, data persistency: Atlas event
scheme, detector description, rapporteur? World-wide computing:
Monarc, rapporteur from Atlas? Simulation: we are most advanced users
of Geant4, should nominate rapporteur. Analysis tools: Experience from
Atlas, Atlas part of Monarc, rapporteur from Atlas?
|
Fri
| 11:20
| Topic
| H. Meinhard: Workshop summary
|
References
| Slides: HTML,
PS,
PDF,
PPT
|
Summary
| (see slides)
|
Fri
| 11:50
| Topic
| N. McCubbin: A few preliminary remarks
|
References
| N/A
|
Summary
|
- E-mail: N.McCubbin@rl.ac.uk
- Next points assume approval by CB
- see CV for brief summary of background
- Transition period of Norman: steady state (~ 80%) by October '99
- Software effort and OO: serious lack of manpower. OO is hoped to
overcome Brooks's law (adding people to a late software project makes
it even later)
- Attempting encapsulation is not new, but Fortran Common blocks made it
difficult
- Students come with knowledge of modern methods, and need jobs after HEP
- Software MOU: Raises profile of software to that of detectors
- What's new in LHC software? Algorithmic complexity (eg. to fit a track)
not much changed, but all the rest (data volumes, CPU power, size of
collaboration, precision, software lifetime, computing environment,
...)
- Mission of Atlas: physics. Physics and software must be in symbiosis,
community (singular!) at large responsible
- A lot of pieces out there, architecture task force must address how
to organise this, and how to go forward. Suggestions, thoughts, ideas,
solutions, ... to Norman
- Training: How did YOU learn?
|
Discussion
|
- We are not in that bad a shape concerning the MOU - funding agencies
have been warned. However, the experience with the Geant4 MOU has been
painful
- The statement that the complexity did not increase was challenged
- Term 'work unit' needs to be specified, probably much below the MOU
level
- Q: How will the organisation work in the next six months? A: Don't know
in detail yet, starting points are review report and action plan. In
transitional period, Jürgen will be very important
- Q: Do you agree that the systems should nominate system task leaders
before the overall coordinators are known? A: In principle yes, but
needs more thinking. Since it is a finite number of people involved,
they should be looked at individually
- Q: When should the architecture group start? A: As soon as possible,
its role is very important
- Q: What does architecture encompass? A: Complex issue... but it should
facilitate the partitioning of the software (MOU, work units). Not
detailed discussion about classes, not hardware architecture
- Q: How do conventions about how to build the work package get set up?
- Q: How do you see the quality control group, and its relation with the
architecture group? A: As soon as possible; Norman does not see a big
danger that the two groups will get in each other's way
- If the architecture task force does not define interfaces and glue, it
must instantiate a group which does
|