Atlas Validation Reports 070501


Reports of the Software Validation on Tuesday May 1, 2007

Manuel Gallas

Wednesday May 2, 2007

Phone Meeting

The next phone meeting will be on Tuesday, May 15, 2007 at 16:10 CERN time (from 16:10 to 17:40).

The meeting time will probably be changed to 16:30 so that people can attend both the SIT and sw-validation meetings. If so, it will be announced by e-mail.

Meeting Coordinates:

ATLAS software validation

(Manuel Gallas)

Dial-in numbers: +41227676000 (Main)
Access codes: 0132753 (Participant)
Participant site: https://audioconf1.cern.ch/call/0132753

IMPORTANT: please send the report of your domain/detector in written form, preferably before Monday at noon, so that the report to the SPMB contains the most up-to-date information.


Software validation coordination

Report on the actions from the previous meeting:

Actions:

  • Follow up those items described in the A.O.B section:

  1. RTT and ATN documentation and examples (links from the Wiki for sw validation)
  2. The mailTo, doc and xml tags can be used right now. There are working examples.
  3. RTT classification tags (ongoing)
  4. Counting the ERROR lines / CPU using the existing RTT machinery (under discussion)
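
As an illustration of the last item only, a minimal Python sketch of counting ERROR lines in a job log follows. It does not use the RTT machinery: the function name `count_error_lines`, the sample log text and the idea of matching the severity as a whole word are all assumptions made for this example, and the CPU part of the item is not covered.

```python
import io


def count_error_lines(log, marker="ERROR"):
    """Count the lines of an open text log that contain `marker` as a whole word.

    Hypothetical helper for illustration; it is not part of the RTT.
    """
    return sum(1 for line in log if marker in line.split())


# Made-up log excerpt, used only to exercise the function.
SAMPLE_LOG = """\
ExampleAlg    INFO  initialize done
ExampleAlg    ERROR could not retrieve container
ExampleSvc    WARNING falling back to defaults
ExampleAlg    ERROR could not retrieve container
"""

print(count_error_lines(io.StringIO(SAMPLE_LOG)))
```

For the sample above the count is 2. Matching on whole words avoids counting lines where "ERROR" only appears inside a longer token.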


Report from the physics validation activities


Report from the software domain and sub-detector software

Core Services (Paolo Calafiura)

No report.

Database (David Malon)

No report.

Infrastructure (Fred Luehring)

Right now there are no open software infrastructure issues for release 13.

Testing frameworks:

ATN (Alex)

Nightlies were plagued by various problems last week (full AFS volumes, AFS connection problems, full /tmp areas, ...). Basically rel_2 and rel_4 were lost, with no reliable ATN results. Currently ATN runs on 11 platforms (all but the "master" platforms), in 5 nightly "branches" (development, bugfix, validation, lcg, 12.0.X).

The test success rate is ~40%, except in the development nightlies, where it is 22%. Number of tests and success rate for the bugfix nightlies:

Project | Number of tests | OK rate
AtlasCore | 17 | 94%
DetCommon | 2 | 0%
AtlasConditions | 6 | 66%
AtlasEvent | 9 | 55%
AtlasReconstruction | 18 | 33%
AtlasSimulation | 6 | 66%
AtlasTrigger | 76 | 23%
AtlasAnalysis | 43 | 13%
AtlasProduction | 18 | 16%
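
The per-project numbers above can be combined into a single figure in more than one way. The short Python sketch below computes two of them: the plain average of the nine project rates, and the rate weighted by the number of tests in each project. The minutes do not say how the quoted ~40% was obtained, so both function names and the choice of these two summaries are assumptions for illustration.

```python
# (project, number of tests, OK rate in percent), copied from the table above.
BUGFIX_ATN = [
    ("AtlasCore", 17, 94),
    ("DetCommon", 2, 0),
    ("AtlasConditions", 6, 66),
    ("AtlasEvent", 9, 55),
    ("AtlasReconstruction", 18, 33),
    ("AtlasSimulation", 6, 66),
    ("AtlasTrigger", 76, 23),
    ("AtlasAnalysis", 43, 13),
    ("AtlasProduction", 18, 16),
]


def mean_project_rate(rows):
    """Plain average of the per-project rates; every project counts equally."""
    return sum(rate for _, _, rate in rows) / len(rows)


def weighted_rate(rows):
    """Rate weighted by test count; every test counts equally."""
    total = sum(n for _, n, _ in rows)
    passed = sum(n * rate / 100 for _, n, rate in rows)
    return 100 * passed / total


print(f"mean of project rates: {mean_project_rate(BUGFIX_ATN):.1f}%")
print(f"test-weighted rate:    {weighted_rate(BUGFIX_ATN):.1f}%")
```

With these numbers the plain average is about 40.7% and the test-weighted rate about 31.1%. The difference comes from AtlasTrigger and AtlasAnalysis, which hold most of the 195 tests and have low success rates.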

RTT (Peter, Brinick)

The RTT has run on 100% of builds available to it. This week that constituted 9/20 possible runs, and gives a measure of the build success.

Date            Release         Platform  RTT ran       Job success rate    Extra info
==========================================================================================================================
30th April      rel_1/bugfix    SLC4      Ongoing       81/236 (249 tot)    Started manually (no NICOS flag)
                rel_1/bugfix    SLC3      Starts later  ---                 ---
                rel_1/val       SLC4      Ongoing       23/115 (249 tot)    Started manually (no NICOS flag)
29th April      rel_0/bugfix    SLC4      No            ---                 NICOS failed to set "release ready" flag
                rel_0/bugfix    SLC3      No            ---                 NICOS failed to set "release ready" flag
                rel_0/val       SLC4      Yes           38/249              Alex manually set "release ready" flag
28th April      rel_6/bugfix    SLC4      No            ---                 Bad build + no "release ready" flag
                rel_6/bugfix    SLC3      No            ---                 Bad build + no "release ready" flag
                rel_6/val       SLC4      No            ---                 Bad build + no "release ready" flag
27th April      rel_5/bugfix    SLC4      No            ---                 NICOS failed to set "release ready" flag
                rel_5/bugfix    SLC3      No            ---                 NICOS failed to set "release ready" flag
                rel_5/val       SLC4      No            ---                 NICOS failed to set "release ready" flag
26th April      rel_4/bugfix    SLC4      No            ---                 Bad build + no "release ready" flag
                rel_4/bugfix    SLC3      No            ---                 Bad build + no "release ready" flag
                rel_4/val       SLC4      Yes           33/247              Re-run (initial build bad)
25th April      rel_3/bugfix    SLC4      Yes           39/135              Bad build
                rel_3/bugfix    SLC3      Yes           78/247              ---
                rel_3/val       SLC4      Yes           72/247              ---
24th April      rel_2/bugfix    SLC4      Yes           75/247              ---
                rel_2/bugfix    SLC3      No            ---                 No build available
                rel_2/val       SLC4      Yes           8/247               Bad build
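
The 9/20 figure quoted above can be cross-checked against the "RTT ran" column of the table. The Python sketch below is one way to do the tally. It assumes that "Yes" and "Ongoing" both count as runs that took place and that the "Starts later" row is left out of the total; the minutes do not state this counting rule, and the name `tally_runs` is made up for the example.

```python
from collections import Counter

# "RTT ran" column, transcribed row by row from the table above.
RTT_RAN = [
    "Ongoing", "Starts later", "Ongoing",  # 30th April, rel_1
    "No", "No", "Yes",                     # 29th April, rel_0
    "No", "No", "No",                      # 28th April, rel_6
    "No", "No", "No",                      # 27th April, rel_5
    "No", "No", "Yes",                     # 26th April, rel_4
    "Yes", "Yes", "Yes",                   # 25th April, rel_3
    "Yes", "No", "Yes",                    # 24th April, rel_2
]


def tally_runs(statuses):
    """Return (runs that took place, possible runs) under the stated assumptions."""
    counts = Counter(statuses)
    # Assumption: finished and ongoing runs both count as "ran".
    ran = counts["Yes"] + counts["Ongoing"]
    # Assumption: a run that has not started yet is excluded from the total.
    possible = len(statuses) - counts["Starts later"]
    return ran, possible


ran, possible = tally_runs(RTT_RAN)
print(f"RTT ran on {ran}/{possible} possible runs")
```

Under these assumptions the tally gives 9 of 20, which matches the number in the text.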

Generators (Giorgos Stavropoulos)

No report

Simulation (Adele Rimoldi)

Status:

Apart from the problems with the nightlies, the simulation core seems to run fine. In the best RTT nightly (rel_1), 10/15 tests are OK, and for the 5 tests that fail it is not clear that there is a problem in the simulation core. We have run one of the failed tests locally and it passes. A new tag for G4AtlasApps will be added to val in order to add new volumes to the LAr SD.

It is important to be able to run the RTT tests locally in order to identify problems that are not related to the code we want to test.

Digitization (Sven Vahsen)

General status of digitization software in release 13

improvements:

  • Trigger people updated the LVL1 digi jobOptions, LVL1 crashes fixed
  • Muon RDO persistency is back (not fully understood) --> all subdetectors + LVL1 again running

main remaining issues:

  • crash in CSC_Digitizer::digitize_hit after a few events (observed in rel_3 val)
  • AtRndmGenSvc needs more than two seed values to fully save the Ranlux engine state. Changes in the service itself needed. Who will work on this?
  • still need to fully enforce that all digitization clients switch from digitization jobflags to jobProperties. Mixing these currently works, but the plan is to disable jobflags before too long.
  • digi transform needs minor updates

RTT tests

overall status:

  • 8/11 digitization RTT "integration" tests now complete successfully
  • description of tests now available on new wiki: https://twiki.cern.ch/twiki/bin/view/Atlas/DigitizationValidation

failing integration RTT tests:

  • pileup: RTT tests need new RTT feature (athena -p). RTT team notified.
  • full ATLAS: crash in CSC_Digitizer::digitize_hit after a few events (just observed in rel_3 val)
  • digi transform: transforms need some work in rel 13

subdetector RTT tests:

  • many digitization RTT tests (such as muondigiexample) have outdated jobOptions

Reconstruction (David Rousseau)

Status:

ESD, AOD and TAG writing technically runs (reading not tested). See the long list of smaller problems in the wiki.

Issues:

  • ATN tests: many RecExRecoTest tests are reported as failed when they in fact succeed. I am in discussion with Alex Undrus to understand this. In some cases (but not all) this is because there is a "Traceback" which is caught (but which only causes an inessential alg to be switched off). Also, Alan Poppleton rightly complained that the RecExRecoTest tests are run on only 2 events, and there is no muon in the first two events of the default test file (tt event). Since CPU time is dominated by initialisation time anyway, the number of events will be increased to 5.
  • we want all RecExXYZTest RTT and ATN auto-mail to be collected in a HN forum, but this first requires the ATN tests to be grouped by package (as is already the case for RTT)

EDM (Davide Costanzo, RD Schaffer)

No report

PAT (Ketevi Assamagan, Tadashi Maeno)

No report

Inner Detector (Markus Elsing)

No report

LAr Calorimeter (Hong Ma, Guillaume Unal, Karim Bernadet (TBC))

Status: (from Karim)

I look at the ATN and RTT results for rel_1:

Tile Calorimeter (Sasha Solodkov)

No report

Muon Spectrometer (Steve Goldfarb, Lashkar Kashif)

Status:

ATN

AtlasReconstruction: 3 packages fail to build:

  1. CSC_ DHoughSegmentMakerAlg
  2. CSC_ DHoughSegmentMakerTool (The developer expected to solve the problems by Wednesday last week (April 25), but that has not been the case)
  3. MuGirl. Package manager's comment: MuGirl compilation fails because it references DCMathSegmentMaker, which has recently been moved to a different location. This has been fixed in the latest tag, but the tag cannot be collected yet because another change is being made. That change is running into problems of its own; the developers are working with David Rousseau to solve them.

RTT

Nectarios fixed some bugs in MuonEvtValidator and collected the new tag, but it still fails the RTT.

According to the list on the RTT results page at http://atlas-project-rtt-results.web.cern.ch/atlas-project-rtt-results/, not a single package out of 42 is running its RTTs successfully. When I look into the log files for individual RTTs, I don't see a generic pattern for the failures either. Each package seems to be failing for a unique reason. We need to sort this out before we start thinking about adding more RTTs.

Tracking (Wolfgang Liebig)

No report.

Trigger (Ricardo Goncalo, Olga Igonkina)

No Report.

Production Transforms (Manuel Gallas)

Status:

  • The 12.0.6.4 cache is in production. This is the first time that TAG, SAN and HPTV can be produced at the same time.
  • One of the internal tests included with the cache fails when we try to produce TAG, SAN, MergedAOD, and HPTV at the same time. This failure is not reproduced by the Full Chain Test, in which we use different input files. No news concerning this item from the HPTV authors.
  • Another issue is that recoESD does not seem to work if Truth is off (investigation ongoing).
  • The 12.0.6.5 nightlies are open; the deadline for the new cache is 2nd May.
  • Activation of the 13.0.0 production scripts starting right now.

A.O.B


-- Main.gallasm - 02 May 2007
Topic revision: r2 - 2007-05-06 - unknown
 