RTT Improvements

Introduction

The RTT can play a very important role in improving the release validation process: it provides an automated way to test software on a nightly basis. Recently, a cluster of 40 high-performance CPUs was brought online at CERN for the purpose of running RTT jobs. For this validation process, it is important to run a comprehensive set of test jobs.

  • Some improvements are needed in how the results of jobs running in the RTT are presented, so that it is easy to see which jobs are failing, etc.
  • In addition, the development of common tools will avoid duplication of effort.

Test configuration

  • (Simon) All test metadata should be included in the XML test configuration: the new tags proposed to classify the tests, plus any extra contact or description information required for managing the tests.

Presentation of results

Currently, results are presented in tabular form for each package in a given build (and compiler option). It would be useful if we could tell at a glance whether, say, all the reconstruction jobs ran or not.

Some suggestions:

  • (Srini) For a given nightly or release, there should be a single web page that displays the status of each job: a green light if it passed, a red one if it failed. Clicking on the job should display the output of the post-processing. This includes:
    • the status and result of each of the AI post-processing steps described above,
    • time stamps showing when the test was executed,
    • a link to the corresponding XML test descriptions,
    • a link to the relevant histogram file; one more click should display those histograms.

  • (Srini) A logical output structure based on tags (in the XML test specs):
      rel_1/RTT-Test/<class>/<process>/<component>   (for the nightlies)
   or
      12.0.6/RTT-Test/<class>/<process>/<component>  (for the releases)

  • (Simon) Here is an idea for the future: provide RSS feeds for the nightly builds, test results, tag collector, Savannah, etc., so that subsystems can build their own software-monitoring dashboards. I feel that this approach would make it much easier to remember what to check and when, to get an immediate overview of status and problem areas, and to delegate the work to new people.
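
As an illustration of the tag-based output structure suggested by Srini above, here is a minimal sketch of how an output directory could be assembled from the three tags. The struct TestTags and the function rttOutputPath are hypothetical names introduced only for this example; they are not part of the RTT.

```cpp
#include <string>

// Hypothetical holder for the classification tags read from the XML test spec.
struct TestTags {
    std::string testClass;  // value of <class>
    std::string process;    // value of <process>
    std::string component;  // value of <component>
};

// Build "<build>/RTT-Test/<class>/<process>/<component>".
// `build` is a nightly name such as "rel_1" or a release number such as "12.0.6".
std::string rttOutputPath(const std::string& build, const TestTags& tags) {
    return build + "/RTT-Test/" + tags.testClass + "/" + tags.process + "/" +
           tags.component;
}
```

The same function serves both nightlies and releases, since only the leading directory differs between the two layouts.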

Post-processing of results

Comparing histograms

    • Seth and Sven have an elaborate system for comparing the histograms produced by their RTT jobs.
    • Denis (Damazio) has another system. Click here for a demonstration.
      • In the left column, clicking on 12.0.4 gives you another column. Clicking on Zee electron results gives you a colored display of the Kolmogorov-Smirnov results. Clicking on the sub-items gives you the histograms.
    • Krzysztof (Ciba) also has a scheme that allows you to fit histograms in a dynamic display. Take a look.
    • It is trivial to compare two histograms in two files using ROOT. This is probably obvious to everyone. See the code snippet in the section "How to do Kolmogorov-Smirnov test in ROOT" at the end of this page.

Process log files

  • This includes:
    • Comparison with reference log files.
    • Checks for errors/warnings.
    • Checks of CPU and memory usage.
    • Checks of the size of output data.
    • Other special checks specified by clients.

  • (Peter) What is being proposed should be put into a standard package of scripts and macros so that it can be run by the RTT machinery. The RTT already contains a number of tools that go in this direction, such as file greppers that allow search and veto strings. Assembling the results into a desired output format would simply be one more RTT <action>.

Tracking results over time. (Simon)

It's not just histograms that we need to be able to track over time, but also simple numbers. For example, I could write out the memory leak for a trigger test job as a single number into a text file. This file has a URL. I would then run a job outside of RTT to grab this data from the URL for each nightly, and plot all available numbers in the system against the nightly or release they come from, in chronological order. Or perhaps RTT could maintain these plots itself, provide me with a way of declaring which quantities I want monitored like this, and add data points on each run.
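
A minimal sketch of the bookkeeping this would need is shown below, assuming each job writes one number to a text file and each data point is stored with an ISO date so that plain string ordering is chronological. TrendPoint, parseValue and addPoint are hypothetical names for this example; fetching the file from its URL and drawing the plot are left out.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// One monitored value from one nightly or release (hypothetical layout).
struct TrendPoint {
    std::string date;   // "YYYY-MM-DD", so string order is chronological
    std::string build;  // nightly or release label, e.g. "rel_1"
    double value;       // the monitored quantity, e.g. a memory leak
};

// Read the single number from the contents of the text file.
// Returns false if the text does not start with a number.
bool parseValue(const std::string& text, double& value) {
    std::istringstream in(text);
    return static_cast<bool>(in >> value);
}

// Compare two points by date only.
bool earlier(const TrendPoint& a, const TrendPoint& b) {
    return a.date < b.date;
}

// Add a point and keep the series in chronological order.
void addPoint(std::vector<TrendPoint>& series, const TrendPoint& point) {
    series.push_back(point);
    std::stable_sort(series.begin(), series.end(), earlier);
}
```

Keeping the series sorted on insertion means the plotting step can simply walk the vector from first to last entry.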

I'd like to repeat my request that the results (job output) from RTT be stored for much longer than they currently are. I would say at least the last 30 days, i.e. do not overwrite the nightlies the way the build system does. Results are still useful even when the code has gone. Otherwise we will never have enough points to track a trend.

Common Tools

(Srini) ...existing tools available for automated histogram comparisons and subsequent displays to report results. The goal is to understand the requirements for such a tool for validation and to develop a common tool. Since the Data Quality group has similar requirements, this work should be coordinated with them to understand how to proceed in defining and implementing the tools needed for validation.

How to do Kolmogorov-Smirnov test in ROOT

To run this file (CompareHists.C), do root CompareHists.C

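#include <cstdio>
#include <iostream>
#include "TFile.h"
#include "TH1F.h"

void CompareHists() {
  TFile f1("ResolutionHistograms_EF_bb_default.root");
  TFile f2("ResolutionHistograms_EF_bb_ZfinderOFF.root");

  // Fetch hEta from each file explicitly; Get() returns a null pointer if it is missing
  TH1F *h1 = (TH1F*) f1.Get("hEta");
  TH1F *h2 = (TH1F*) f2.Get("hEta");
  if (!h1 || !h2) {
    std::cout << "hEta not found in one of the input files" << std::endl;
    return;
  }

  // Copy the two histograms into memory (with distinct names)
  // and set the normalization for them = 1
  TH1F *default_hEta = (TH1F*) h1->Clone("default_hEta");
  default_hEta->Sumw2();
  default_hEta->Scale(1.0/default_hEta->Integral());

  TH1F *ZfinderOFF_hEta = (TH1F*) h2->Clone("ZfinderOFF_hEta");
  ZfinderOFF_hEta->Sumw2();
  ZfinderOFF_hEta->Scale(1.0/ZfinderOFF_hEta->Integral());

  // do the KS test. The X option gives the result from 1000 pseudo-experiments,
  // N also compares the normalizations, and D prints debug information.
  // See the ROOT manual for details
  Double_t EtaKSTest = ZfinderOFF_hEta->KolmogorovTest(default_hEta,"NDX");

  // print out result of 1000 pseudo experiments
  char fit[100];
  snprintf(fit, sizeof(fit), "KS TEST = %f", EtaKSTest);
  std::cout << fit << std::endl;
}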
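
For readers who want to see what the test measures, the quantity underlying a Kolmogorov-Smirnov comparison of two binned histograms is the largest difference between their normalized cumulative distributions. The sketch below computes only that distance, using the standard library and assuming both histograms share the same binning. It does not reproduce ROOT's conversion of the distance into a probability or its pseudo-experiments, and ksDistance is a name chosen for this example.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <vector>

// Maximum distance between the normalized cumulative distributions of two
// histograms given as bin contents. Both must use identical binning.
double ksDistance(const std::vector<double>& h1, const std::vector<double>& h2) {
    if (h1.size() != h2.size()) {
        throw std::invalid_argument("histograms must have the same binning");
    }
    const double sum1 = std::accumulate(h1.begin(), h1.end(), 0.0);
    const double sum2 = std::accumulate(h2.begin(), h2.end(), 0.0);
    if (sum1 <= 0.0 || sum2 <= 0.0) {
        throw std::invalid_argument("histograms must have a positive integral");
    }
    double c1 = 0.0, c2 = 0.0, dmax = 0.0;
    for (std::size_t i = 0; i < h1.size(); ++i) {
        c1 += h1[i] / sum1;  // running fraction of histogram 1
        c2 += h2[i] / sum2;  // running fraction of histogram 2
        dmax = std::max(dmax, std::fabs(c1 - c2));
    }
    return dmax;
}
```

Identical shapes give a distance of 0, and two histograms with no overlap give 1, so the value can serve as a quick shape check before looking at the plots.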


-- VivekJain - 12 Mar 2007

Topic revision: r6 - 2007-03-23 - KrzysztofCiba
 