Atlas IO Performance - CHEP2010

DataSets

MC

AOD and ESD files containing 500 events. The files were produced by a RecoTrf job running in RTT on 4th September. The dataset containing both files is user.ilijav.IOperf.MC.AOD.ESD.allfiles.v1.

Real Data

data10_7TeV.00159113

duration [s]  LumiBlocks  events
63182         528         6757197

Dataset name                                                     files  total size [GB]  event size [kB/event]
data10_7TeV.00159113.physics_L1Calo.merge.TAG.f275_m548_m547     2      1.68             0.26
data10_7TeV.00159113.physics_L1Calo.merge.AOD.f275_m548          274    834              129.44
data10_7TeV.00159113.physics_L1Calo.recon.ESD.f275               3408   8784             1363.09
data10_7TeV.00159113.physics_L1Calo.merge.RAW                    3408   9930             1540.93

data10_7TeV.00155634.physics_L1Calo.merge.NTUP_BTAG.f260_p169    254    117              109
data10_7TeV.00159113.physics_L1Calo.merge.NTUP_EGAMMA.f275_p179  532    92               14.22
data10_7TeV.00159113.physics_L1Calo.merge.NTUP_JETMET.f275_p196  1133   47               7.32
data10_7TeV.00159113.physics_L1Calo.merge.NTUP_TRIG.f275_p194    2297   914              ?
data10_7TeV.00159113.physics_L1Calo.merge.NTUP_WZ.f275_p198      578    426              66.04
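
The event size column appears to be the total dataset size divided by the number of events in the run. A minimal sketch of that arithmetic follows; it assumes binary units (1 GB = 1024 x 1024 kB), which is my guess at how the table was computed, and the function name event_size_kb is only illustrative.

```python
N_EVENTS = 6757197  # events in run 00159113, from the run summary above


def event_size_kb(total_size_gb, n_events=N_EVENTS):
    """Average event size in kB, assuming 1 GB = 1024**2 kB (an assumption)."""
    return total_size_gb * 1024**2 / n_events


if __name__ == "__main__":
    # Total sizes in GB as listed in the table above.
    for name, size_gb in [("TAG", 1.68), ("AOD", 834), ("ESD", 8784), ("RAW", 9930)]:
        print(f"{name}: {event_size_kb(size_gb):.2f} kB/event")
```

Under this assumption the listed values are reproduced to within rounding of the total sizes. The same check does not apply directly to the BTAG row, which comes from a different run.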

Of all the D3PDs, we should now concentrate on the EGAMMA and JETMET streams. (BTAG files are not usable until their compression is fixed, and TRIG files are quite specific in both configuration and usage.)

For the local disk tests only a subset of the real data files is used:

format  dataset
AOD     user.ilijav.IOperf.AOD.allfiles.v1
ESD     user.ilijav.IOperf.ESD.allfiles.v1
EGAMMA  user.ilijav.IOperf.EGAMMA.NTUP.allfiles.v1
JETMET  user.ilijav.IOperf.JETMET.NTUP.allfiles.v5

AOD and ESD data sets contain:

  • one original file: lb0300.0001.1
  • one file with offending classes removed (MakeProject crashes on original files)
  • one file where baskets were not reordered

D3PD data sets listed above contain:

  • original files
  • one file that is a simple merge of all the original files (what we were using)
  • one merged file reordered byEntry and with optimized basket sizes (what we use now)
  • one merged file reordered byBranch and with optimized basket sizes

To reproduce the situation we had before automatic basket reordering was introduced at the finalize step of the Athena production jobs, I have manually rewritten the real data AODs and ESDs. The corresponding (full) datasets are intended for large scale (not local disk) tests:

  • user.ilijav.IOperf.Unordered.data10_7TeV.00159113.physics_L1Calo.merge.AOD.f275_m548.v3/
  • user.ilijav.IOperf.Unordered.data10_7TeV.00159113.physics_L1Calo.recon.ESD.f275.v3/

If you have enough space, please replicate them as soon as possible, as they currently exist only on scratch disk. Their size should be the same as that of the original datasets.

Tests

Local disk

When doing this kind of test it is very important:
  • that the machine is completely free of any other activity
  • to always use "taskset -c N" to pin the job to the same core, as different cores can show quite different performance
  • to always evict the file being read from the file cache before each run. On Linux systems you can use the following tools, available in the Athena environment:
    • releaseFileCache.exe
      usage: releaseFileCache <filename>
      removes from memory all cache associated with the given file.
    • checkCache.exe
      usage: checkCache <filename>
      prints out all cached blocks of the given file. Warning: for a large file that may mean a big printout.
  • for MacOS/Windows, send me a mail and I'll send you details on how to clean the cache.
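
The per-run recipe above (evict the input file from the cache, then run the job pinned to one core) could be wrapped as in the sketch below. It only builds the command lines. The function name pinned_run_commands, the example job command and the example file name are hypothetical, and the executable name releaseFileCache.exe is taken from the list above and assumed to be on the PATH.

```python
import shlex


def pinned_run_commands(input_file, job_argv, core=0):
    """Return the commands, in order, for one cold-cache measurement."""
    return [
        # Step 1: drop the input file from the file cache (tool named above).
        ["releaseFileCache.exe", input_file],
        # Step 2: run the job pinned to one fixed core.
        ["taskset", "-c", str(core)] + list(job_argv),
    ]


if __name__ == "__main__":
    # "athena.py readAOD.py" and "AOD.pool.root" are hypothetical placeholders.
    for argv in pinned_run_commands("AOD.pool.root", ["athena.py", "readAOD.py"], core=2):
        print(shlex.join(argv))
```

Each returned list can be passed to subprocess.run(argv, check=True) to execute it. Keeping the core number as an explicit argument makes it harder to compare runs from different cores by accident.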

Reading/Writing from Athena

The purpose of these tests is to show the pure CPU time needed to read AOD/ESD files. We also want to show that IO performance depends on object size and object complexity.

For this series of tests we will use both MC and real data (8 tests in total: 2 formats (AOD, ESD) x 2 operations (read, write) x 2 samples (MC, real data)). One file of each kind is enough.
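
The eight combinations (two formats, read and write, MC and real data) can be enumerated mechanically so that none is forgotten. This is only a bookkeeping sketch; the dictionary keys and the function name athena_test_matrix are my own choices.

```python
from itertools import product

FORMATS = ("AOD", "ESD")
OPERATIONS = ("read", "write")
SAMPLES = ("MC", "real data")


def athena_test_matrix():
    """Return one dict per test: every format x operation x sample combination."""
    return [
        {"format": fmt, "operation": op, "sample": sample}
        for fmt, op, sample in product(FORMATS, OPERATIONS, SAMPLES)
    ]


if __name__ == "__main__":
    for i, test in enumerate(athena_test_matrix(), start=1):
        print(i, test["format"], test["operation"], test["sample"])
```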

The numbers needed (for each collection separately) are: collection name, collection size, its compression factor, total time to read it in, and ROOT read time.
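
One possible way to hold these per-collection numbers is a small record type, sketched below. The field names and units (kB, ms) are assumptions, and the derived quantity non_root_time_ms (total read time minus ROOT read time) is my own addition for illustration, not something the test plan defines.

```python
from dataclasses import dataclass


@dataclass
class CollectionRead:
    """Per-collection read measurements (units are assumptions)."""
    name: str
    size_kb: float
    compression_factor: float
    total_read_time_ms: float
    root_read_time_ms: float

    @property
    def non_root_time_ms(self) -> float:
        # Time spent outside ROOT; an illustrative derived quantity.
        return self.total_read_time_ms - self.root_read_time_ms


if __name__ == "__main__":
    # Placeholder values only, not measurements.
    row = CollectionRead("ExampleCollection", 10.0, 2.5, 8.0, 3.0)
    print(row.name, row.non_root_time_ms)
```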

While the same numbers can be collected when writing, I can't extract any meaning from them, as the actual writing is delegated to the commit procedure, which is not instrumented.

Split level, zip level, and basket resize options will not be checked, as these are deemed unaffordable in CPU or disk.

Relevant code may be found in svn://svn.cern.ch/reps/io_perf/Tools/LocalDisk/Athena

Reading from ROOT script

The purpose of this test is to show the influence of file organization on IO performance, especially on real time and disk utilization. The main test should be reading in 4 different scenarios:

  • reading the full file
  • reading a randomly chosen 10% of events
  • reading a randomly chosen 1% of events
  • reading just some branches
All of these should be done with the following file configurations:
  • file is unordered
  • file is reordered byEntry (the current situation with AODs and ESDs)
  • unordered file read using TTreeCache
  • if possible to do in time: files made with all ROOT defaults (autoflush at 30 MB, basket sizes optimized by ROOT)
  • for D3PDs: file ordered by branch
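
For the 10% and 1% scenarios, the same randomly chosen events should be read from every file configuration, otherwise the comparison mixes file layout effects with selection effects. A minimal sketch of a reproducible selection is below; the function name pick_entries, the fixed seed and the choice to read in increasing entry order are all assumptions, not taken from the code in svn.

```python
import random


def pick_entries(n_entries, fraction, seed=12345):
    """Return a sorted list of distinct entry numbers covering `fraction` of the events."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    n_selected = max(1, round(n_entries * fraction))
    rng = random.Random(seed)  # fixed seed: same events for every file layout
    return sorted(rng.sample(range(n_entries), n_selected))


if __name__ == "__main__":
    # 500 is the number of events in the MC files described above.
    for fraction in (1.0, 0.10, 0.01):
        entries = pick_entries(500, fraction)
        print(fraction, len(entries))
```

In a PyROOT or ROOT script, each selected number would then be passed to TTree::GetEntry.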

Relevant code may be found in : svn://svn.cern.ch/reps/io_perf/Tools/LocalDisk/Root

Large scale tests

It would be great to have these tests done in a controlled environment, but that will be difficult to arrange these days.

Ideally, the tests would be run:

  • on unordered data
  • on standard data
  • with the new default ROOT settings (Athena 16.+)

Results should include:

  • total run time
  • total data transfer or data transfer rate vs time
  • CPU utilization, averaged or versus time
  • events/s, averaged or versus time
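
When only totals are available, the averaged quantities can be derived from them as in the sketch below. The function name summarize_run is hypothetical, and CPU utilization is taken here as CPU time divided by wall-clock run time for a single job, which is an assumption about the intended definition.

```python
def summarize_run(run_time_s, bytes_transferred, cpu_time_s, n_events):
    """Average rates for one job, computed from its totals."""
    return {
        "transfer_rate_MB_per_s": bytes_transferred / 1e6 / run_time_s,
        "cpu_utilization": cpu_time_s / run_time_s,  # assumed definition
        "events_per_s": n_events / run_time_s,
    }


if __name__ == "__main__":
    # Placeholder totals only, not measurements.
    print(summarize_run(run_time_s=100.0, bytes_transferred=2e9, cpu_time_s=50.0, n_events=500))
```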

Tests:

  • D3PD making: run AODToEgammaD3PD.py. In total 6 tests: 2 formats (AOD, ESD) x 3 kinds of data (unordered, official, "feature")
  • PROOF reading / analysis of D3PDs: 100%, 10%, and 1% of events, on unordered, fully optimized, and new-ROOT optimized D3PDs

DPM

xrootd

dCache

Results

Most of the results will be collected in one Google spreadsheet, which is open for public editing and can be found here: https://spreadsheets.google.com/ccc?key=0AiPvgbRljNCodEFtVEtmRERBTExRb2xVcldNeXE1X2c&hl=en&pli=1#gid=1 Some parts of it are already filled in. Please feel free to add your own results.

-- IlijaVukotic - 08-Sep-2010

Topic attachments
Attachment                   Size      Date              Who           Comment
20100621_TOB_DataSurvey.pdf  6115.6 K  2010-09-09 11:41  IlijaVukotic  Survey of Data usage
Presentation.pptx            3847.0 K  2010-10-07 14:32  IlijaVukotic  Submitted version
Topic revision: r7 - 2010-10-07 - IlijaVukotic
 