DATA ANALYSIS SOFTWARE TOOLS
REQUIREMENTS (Draft)
Rita Somigliana and Krzysztof Sliwa
Tufts University / ATLAS
1.0 AnalysisTool Requirements
The analysis software tools should provide a user
with flexible means of accessing the desired objects and their data members
(attributes), performing manipulations and operations on the selected
data members (analysis), visualizing the results, storing them if so desired,
and representing them in a variety of publication-quality forms.
>>> Mission Statement in http://atlasinfo.cern.ch/Atlas/GROUPS/GRAPHICS/Texts/EventDisplay/
Sounds like a rather general definition of a program.
Is the analysis tool some special program or
just a way of assembling our standard components?
1.1 AnalysisTool (AnT) Functional Requirements
Design - an odd subheading
-
AnT design should be modular and reusable, and should allow
modules to be added and removed without major changes to the program. >>>
New graphical clients (System Properties), Extensibility, Graphics
Architectural Choices, New Objects (Constraints), New Views (Constraints)
- no more so than rest of our software
-
AnT should be able to save an analysis procedure and restart it in the
same state it was in at exit time. >>> Persistency ????
-
AnT should provide a standard mechanism to store information
and operations executed in each analysis procedure (e.g. information about
the dataset, the selection cuts, and the calibration data used, if attributes
were re-calculated in an analysis job) so that the analysis can be repeated
with identical results.
>>> Persistency -
no more so than rest of our software
-
AnT should provide a standard mechanism to store information
on any errors encountered in any data manipulation (e.g. fitting, mathematical
manipulations, display). The information should be stored in an object
generated by the data operations - no more
so than rest of our software
-
AnT should provide a standard mechanism to append information
on the data related to an analysis (for example, the criteria used to select
data and the conditions under which data were collected) to the analysis results. -
no more so than rest of our software
-
AnT should provide a standard mechanism to store and
view results of the preliminary, intermediate, and final stages
of analysis. >>> Persistency, Operations on Real
Objects (Introduction), Operations on Graphical Objects (Introduction)
- what stages?
-
AnT should allow results to be viewed interactively
and, if needed, saved in a standard format for
possible inclusion in informal and formal publications. >>>
Interactivity, Presentation Graphics, Graphical Output Formats -
this must be graphics!
-
AnT should display one or more events simultaneously.
>>> Data on Views Coherency, Views Coherency,
Views Multiplicity - this must be graphics!
-
AnT should make it possible to plot, graph, and otherwise represent
graphically the results from simple and multiple data sets. >>>
many places - this must be graphics!
-
AnT should be easy enough to learn that its basic functionality can be
mastered in a short time (a few hours). >>> inherited
from Global Requirements - no more
so than rest of our software
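The storage requirements in the Design list above (dataset, selection cuts, calibration data, and any errors kept with the analysis) could be met by a single serializable record. The following is only a minimal sketch under stated assumptions: the class name AnalysisRecord, its fields, and the JSON format are hypothetical illustrations and are not part of any AnT or ATLAS design.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class AnalysisRecord:
    """Hypothetical record of what one analysis job did (not an AnT API)."""
    dataset: str
    selection_cuts: list = field(default_factory=list)
    calibration: str = ""
    errors: list = field(default_factory=list)

    def log_error(self, operation, message):
        # Keep the error with the record instead of only printing it.
        self.errors.append({"operation": operation, "message": message})

    def to_json(self):
        # Serialize every field so the job can be repeated later.
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text):
        # Rebuild the record from its stored form.
        return cls(**json.loads(text))


# Illustrative values only; the names and cuts are made up.
record = AnalysisRecord(
    dataset="run_0001",
    selection_cuts=["pt > 20", "abs(eta) < 2.5"],
    calibration="calib_v1",
)
record.log_error("fit", "did not converge")
restored = AnalysisRecord.from_json(record.to_json())
```

Here restored carries the same dataset, cuts, calibration tag, and error list as record, which is the property the reproducibility requirement asks for. A real implementation would presumably use whatever persistency mechanism the rest of the software adopts.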
Implementation - another odd heading
-
AnT should allow reading different input data formats
and writing different output data formats. Conversion routines should be
available to read and write objects in various formats for easy interfacing
with outside software packages. >>> Graphical Output
Formats, Data Sources Choice - sounds
like a general persistency requirement
-
AnT should provide a variety of mathematical manipulation
packages as well as graphical display packages. -
may need to interface to maths packages
-
AnT should provide default (native) mathematical manipulation
and graphical display packages.
-
AnT should apply standards to data representations,
printing files (e.g. PostScript format), graphical outputs (e.g. wrl, gif,
ps), and interfaces (e.g. GUI requirements). The standards should be either
industrial or internal (CERN-wide, for example) and should be strictly
applied. >>> Graphical Output Formats -
graphics again
-
AnT response time should be reasonably fast. >>>
Speed, Progressivity, Interruptibility
Objects Definitions
-
Real objects should allow direct access to their attributes
(member data and functions in the case of C++) to permit creation of new objects
and customization of operations on the objects. >>>
Operations on Real Objects (Introduction), Creation
-
A standard mechanism should be defined to display associations
between objects in the database (if a database is being used as input)
and to browse across objects, following associations, without the user's prior
knowledge of the database schema. >>> Genealogy
-
Graphical objects should be generated from operations
on real objects for display purposes. >>> Operations
on Real Objects (Introduction), Operations on Graphical Objects (Introduction)
-
A library of high-level objects should be available to
operate on real data and graphical objects (e.g. plotting objects). >>>
Library of Objects
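The requirement above to browse across objects by following associations, without prior knowledge of the schema, can be illustrated with runtime introspection. This is a minimal sketch under stated assumptions: the classes Event, Track, and Vertex and the function browse are hypothetical, and the sketch walks in-memory Python objects rather than a real object database.

```python
def browse(obj, depth=0, seen=None):
    """Return indented lines describing obj and the objects it references.

    Attributes are discovered with vars(), so no schema is needed.
    """
    seen = set() if seen is None else seen
    lines = ["  " * depth + type(obj).__name__]
    if id(obj) in seen:
        # Already listed once; do not expand again (also guards against cycles).
        return lines
    seen.add(id(obj))
    for name, value in vars(obj).items():
        children = value if isinstance(value, list) else [value]
        for child in children:
            if hasattr(child, "__dict__"):
                # An association to another object: follow it.
                lines.append("  " * (depth + 1) + name + " ->")
                lines.extend(browse(child, depth + 2, seen))
            else:
                # A plain attribute: show its value.
                lines.append("  " * (depth + 1) + f"{name} = {child!r}")
    return lines


# Hypothetical classes standing in for "real objects".
class Vertex:
    def __init__(self, z):
        self.z = z


class Track:
    def __init__(self, pt, vertex):
        self.pt = pt
        self.vertex = vertex


class Event:
    def __init__(self, number, tracks):
        self.number = number
        self.tracks = tracks


shared_vertex = Vertex(0.5)
event = Event(7, [Track(25.0, shared_vertex), Track(31.5, shared_vertex)])
listing = browse(event)
```

Printing "\n".join(listing) gives an indented tree starting at Event. The shared vertex is expanded only the first time it is reached, which keeps the listing finite when associations form loops.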
Setups
-
AnT should provide default setups to perform common
event analysis or to view entire events. Users should be allowed to generate
and save their customized setups for later use and to change setups during
analysis. >>> Setup Change, Setup Saving
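One simple way to picture the setup requirement is a default dictionary of settings that a user overrides, saves to a file, reloads, and modifies during analysis. The sketch below assumes a JSON file and hypothetical setting names (view, histogram_bins, cuts); none of these come from the requirements themselves.

```python
import json
import os
import tempfile

# Hypothetical default setup; the keys are illustrative only.
DEFAULT_SETUP = {"view": "full_event", "histogram_bins": 50, "cuts": []}


def make_setup(**overrides):
    """Start from the default setup and apply user customizations."""
    setup = dict(DEFAULT_SETUP)
    setup["cuts"] = list(DEFAULT_SETUP["cuts"])  # copy so the default is not shared
    setup.update(overrides)
    return setup


def save_setup(setup, path):
    """Write a setup to disk for later use."""
    with open(path, "w") as f:
        json.dump(setup, f, indent=2)


def load_setup(path):
    """Read a previously saved setup."""
    with open(path) as f:
        return json.load(f)


path = os.path.join(tempfile.mkdtemp(), "my_setup.json")
custom = make_setup(histogram_bins=100, cuts=["pt > 20"])
save_setup(custom, path)
reloaded = load_setup(path)
reloaded["view"] = "tracks_only"  # change the setup during analysis
```

Keeping the default untouched and copying it for each user means a customized setup can always be discarded in favour of the standard one.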
1.2 Analysis Tool development environment:
-
AnT should run on several platforms: Linux, Windows
NT, DEC-UNIX, SUN, et cetera (ATLAS supported platforms). >>>
inherited from Global Requirements
-
AnT should run from desktops as well as from central
servers. >>> inherited from Global Requirements
-
AnT should be written in a scripting language, which
allows easy control of data selections and operations on data objects.
>>> probably not
1.3 Programming language specifications:
The programming language should:
-
support online operation (interactive and non-interactive)
and batch mode >>> Interactivity,
Batch
-
support command line file operation
-
operate on objects >>> Operation
on Real Objects (Introduction), Operation on Graphical Objects (Introduction)
-
control the overall analysis (selection of the sequence of
operations on data, screen layout and plotting, fitting etc.) and operate
on multiple data objects if needed
-
support its own I/O format for objects and data structures
and have its own libraries operating on objects stored in its internal
format
-
perform complex mathematical operations on the data
(e.g. fitting algorithms, statistical functions). Operation in double
precision should be available
-
support dynamic linking to high-level external language
routines (C, C++, Fortran or other approved high-level language) >>>
Applications and Libraries
-
provide browsing capabilities through the data objects
and data structures to the outside world (users) with a varying degree
of modularity. >>> Genealogy, Event Surfing, Event
Selection
-
make algorithms used internally available to the outside
world, if needed
-
allow serial access to the input data (a stream of data
from a storage medium), and random access (individual objects within a larger
event stream) from various devices in a mass storage hierarchy, and output
data of different granularities to various output devices.
-
allow parallel processing of very large data streams
to improve the response time >>> Parallel Processing
-
support sophisticated packages to plot data
-
have debugging facilities >>>
inherited from Global Requirements
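The distinction drawn in the list above between serial access (reading a stream from start to end) and random access (fetching one object from within a larger stream) can be sketched with fixed-size binary records. Everything here is an assumption made for illustration: the record layout (a 32-bit event number followed by a 64-bit energy), the function names, and the use of an in-memory stream in place of a mass storage device.

```python
import io
import struct

# Hypothetical event layout: little-endian int32 event number, float64 energy.
RECORD = struct.Struct("<id")


def write_events(stream, events):
    """Append (number, energy) pairs to the stream as fixed-size records."""
    for number, energy in events:
        stream.write(RECORD.pack(number, energy))


def read_serial(stream):
    """Serial access: yield every record in order from the start."""
    stream.seek(0)
    while True:
        chunk = stream.read(RECORD.size)
        if len(chunk) < RECORD.size:
            return
        yield RECORD.unpack(chunk)


def read_random(stream, index):
    """Random access: jump straight to record number `index` (0-based)."""
    stream.seek(index * RECORD.size)
    return RECORD.unpack(stream.read(RECORD.size))


stream = io.BytesIO()
write_events(stream, [(1, 10.5), (2, 20.25), (3, 30.0)])
all_events = list(read_serial(stream))
third = read_random(stream, 2)
```

Fixed-size records make the offset of any event a simple multiplication. Variable-size objects would need an index from object identifier to byte offset, which this sketch does not attempt.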
1.4 Documentation
-
Analysis Tool system documentation should be available
on the World Wide Web (W3) in a well-organized manner. >>>
inherited from Global Requirements
-
A well-prepared tutorial, with ample examples, should
be made available on W3. It should be in the form of a self-study course.
>>> inherited from Global Requirements
-
The design diagram of Analysis Tool must be made available
in one of the standard design tool formats adopted by ATLAS. >>>
inherited from Global Requirements
1.5 References
-
Atlas Display Document (requirements section) - Julius
Hrivnac >>> http://atlasinfo.cern.ch/Atlas/GROUPS/GRAPHICS/Texts/EventDisplay/Requirements/
-
Run-II analysis software requirements - CDF/CD/D0 Run-II
Working Group