ATLAS IO Performance - CHEP2010
MC
AOD and ESD files containing 500 events. The files were produced by a
RecoTrf job running in RTT on 4 September.
The dataset containing both files is user.ilijav.IOperf.MC.AOD.ESD.allfiles.v1.
Real Data
data10_7TeV.00159113
| Dataset name | files | total size [GB] | event size [kB/ev] |
| data10_7TeV.00159113.physics_L1Calo.merge.TAG.f275_m548_m547 | 2 | 1.68 | 0.26 |
| data10_7TeV.00159113.physics_L1Calo.merge.AOD.f275_m548 | 274 | 834 | 129.44 |
| data10_7TeV.00159113.physics_L1Calo.recon.ESD.f275 | 3408 | 8784 | 1363.09 |
| data10_7TeV.00159113.physics_L1Calo.merge.RAW | 3408 | 9930 | 1540.93 |
| data10_7TeV.00155634.physics_L1Calo.merge.NTUP_BTAG.f260_p169 | 254 | 117 | 109 |
| data10_7TeV.00159113.physics_L1Calo.merge.NTUP_EGAMMA.f275_p179 | 532 | 92 | 14.22 |
| data10_7TeV.00159113.physics_L1Calo.merge.NTUP_JETMET.f275_p196 | 1133 | 47 | 7.32 |
| data10_7TeV.00159113.physics_L1Calo.merge.NTUP_TRIG.f275_p194 | 2297 | 914 | ? |
| data10_7TeV.00159113.physics_L1Calo.merge.NTUP_WZ.f275_p198 | 578 | 426 | 66.04 |
Of all the D3PDs, we should now concentrate on the EGAMMA and JETMET streams. (The BTAG files are not usable until their compression is fixed, and the TRIG files are quite specific in both configuration and usage.)
For the local disk tests only a subset of the real data files is used:
| format | dataset |
| AOD | user.ilijav.IOperf.AOD.allfiles.v1 |
| ESD | user.ilijav.IOperf.ESD.allfiles.v1 |
| EGAMMA | user.ilijav.IOperf.EGAMMA.NTUP.allfiles.v1 |
| JETMET | user.ilijav.IOperf.JETMET.NTUP.allfiles.v5 |
AOD and ESD data sets contain:
- one original file: lb0300.0001.1
- one file with offending classes removed (MakeProject crashes on the original files)
- one file where baskets were not reordered
D3PD data sets listed above contain:
- original files
- one file produced by simply merging all the original files - what we were using
- one merged file reordered byEntry and with optimized basket sizes - what we actually use
- one merged file reordered byBranch and with optimized basket sizes
To be able to reproduce the situation we had before we introduced automatic basket reordering in the finalize step of the Athena production jobs, I have manually rewritten the real data AODs and ESDs.
Corresponding (full) datasets are intended for large scale (not local disk) tests:
- user.ilijav.IOperf.Unordered.data10_7TeV.00159113.physics_L1Calo.merge.AOD.f275_m548.v3/
- user.ilijav.IOperf.Unordered.data10_7TeV.00159113.physics_L1Calo.recon.ESD.f275.v3/
If you have enough space, please replicate them as soon as possible, as they currently exist only on scratch disk. Their size should be the same as that of the original datasets.
Tests
Local disk
When doing this kind of test it is very important:
- that the machine is completely free of any other activity
- to always use "taskset -c N" to pin the job to the same core, as different cores can show quite different performance
- to always evict the file being read from the cache. On Linux systems you can use the following tools, available in the Athena environment:
- releaseFileCache.exe
usage: releaseFileCache <filename>
removes from memory all cached pages associated with the given file.
- checkCache.exe
usage: checkCache <filename>
prints out all the cached blocks of the given file. Warning: for a large file that may mean a big printout.
- for MacOS/Windows, send me a mail and I'll send you the details of how to clean the cache.
Reading/Writing from Athena
The purpose of these tests is to measure the pure CPU time needed to read AOD/ESD files. We also want to show that IO performance depends on object size and object complexity.
For this series of tests we will use both MC and real data (8 tests in total: 2x AOD/ESD, 2x read/write, 2x MC/real data). One file of each kind is enough.
The numbers needed (for each collection separately) are: collection name, collection size, compression factor, total time to read in, and ROOT read time.
While the same numbers could be collected for writing, I can't extract any meaning from them, as the actual writing is delegated to the commit procedure, which is not instrumented.
Split level, zip level, and basket resize options will not be checked, as these are deemed unaffordable in CPU or disk.
Relevant code may be found in svn://svn.cern.ch/reps/io_perf/Tools/LocalDisk/Athena
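The per-collection bookkeeping above reduces to two numbers per read: a compression factor and a wall-clock time. A minimal Python sketch of that bookkeeping, assuming sizes in bytes; the helper names are mine, not part of the Athena code:

```python
import time

def compression_factor(uncompressed_bytes, compressed_bytes):
    """Compression factor, taken as uncompressed size over on-disk (zipped) size."""
    return uncompressed_bytes / compressed_bytes

def timed_read(read_fn, *args):
    """Run one read call and return (result, elapsed wall time in seconds)."""
    t0 = time.perf_counter()
    result = read_fn(*args)
    return result, time.perf_counter() - t0
```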
Reading from ROOT script
The purpose of this test is to show the influence of file organization on IO performance, especially on real time and disk utilization.
The main test here is reading in 4 different scenarios:
- reading full file
- randomly chosen 10% events
- randomly chosen 1% of events
- just some branches
All of these should be done in the following ways:
- file is unordered
- file is reordered byEntry - current situation with AODs and ESDs
- unordered file using TTreeCache
- if possible to do in time - make files with all ROOT defaults - autoflush at 30 MB, basket size optimized by ROOT
- for D3PDs, a file ordered byBranch
Relevant code may be found in : svn://svn.cern.ch/reps/io_perf/Tools/LocalDisk/Root
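The sparse-read scenarios (10% and 1% of events) are only comparable across the unordered, byEntry, and byBranch files if every run touches the same events, so the random choice should be seeded. A minimal sketch, assuming a fixed per-test seed; the function name is mine:

```python
import random

def sample_events(n_events, fraction, seed=42):
    """Deterministically pick ~fraction of the event indices to read."""
    rng = random.Random(seed)
    return [i for i in range(n_events) if rng.random() < fraction]
```

Because the selection is seeded, repeated runs hit identical event lists on every file layout being compared.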
LARGE SCALE tests
It would be great to have these tests done in a controlled environment, but that will be difficult to arrange these days.
Ideally, the tests would be run:
- on unordered data
- on standard data
- on data with the new default ROOT settings (Athena 16.+)
Results should include:
- total run time
- total data transfer or data transfer rate vs time
- average or versus time CPU utilization
- average or versus time events/s
Tests:
- D3PD making - run AODToEgammaD3PD.py. 6 tests in total: 2x AOD/ESD, each on unordered data, official data, and "feature" data
- PROOF reading / analysis of D3PDs - 100%, 10%, 1% - on unordered / fully optimized / newRoot-optimized D3PDs
DPM
xrootd
dCache
Results
Most of the results will be collected in one Google spreadsheet, which is open for public editing and can be found here:
https://spreadsheets.google.com/ccc?key=0AiPvgbRljNCodEFtVEtmRERBTExRb2xVcldNeXE1X2c&hl=en&pli=1#gid=1
Some parts of it are already filled in. Please feel free to add your own results.
--
IlijaVukotic - 08-Sep-2010