ATLAS IO Performance - CHEP2010
MC
AOD and ESD files containing 500 events. The files were produced by a
RecoTrf job running in RTT on 4 September.
The dataset containing both files is user.ilijav.IOperf.MC.AOD.ESD.allfiles.v1.
Real Data
data10_7TeV.00159113
| Dataset name | files | total size [GB] | event size [kB/ev] |
| data10_7TeV.00159113.physics_L1Calo.merge.TAG.f275_m548_m547 | 2 | 1.68 | 0.26 |
| data10_7TeV.00159113.physics_L1Calo.merge.AOD.f275_m548 | 274 | 834 | 129.44 |
| data10_7TeV.00159113.physics_L1Calo.recon.ESD.f275 | 3408 | 8784 | 1363.09 |
| data10_7TeV.00159113.physics_L1Calo.merge.RAW | 3408 | 9930 | 1540.93 |
| data10_7TeV.00155634.physics_L1Calo.merge.NTUP_BTAG.f260_p169 | 254 | 117 | 109 |
| data10_7TeV.00159113.physics_L1Calo.merge.NTUP_EGAMMA.f275_p179 | 532 | 92 | 14.22 |
| data10_7TeV.00159113.physics_L1Calo.merge.NTUP_JETMET.f275_p196 | 1133 | 47 | 7.32 |
| data10_7TeV.00159113.physics_L1Calo.merge.NTUP_TRIG.f275_p194 | 2297 | 914 | ? |
| data10_7TeV.00159113.physics_L1Calo.merge.NTUP_WZ.f275_p198 | 578 | 426 | 66.04 |
Of all the D3PDs, we should now concentrate on the EGAMMA and JETMET streams. (The BTAG files are not usable until their compression is fixed, and the TRIG files are quite specific in both configuration and usage.)
For the local disk tests only a subset of the real data files is used:
| format | dataset |
| AOD | user.ilijav.IOperf.AOD.allfiles.v1 |
| ESD | user.ilijav.IOperf.ESD.allfiles.v1 |
| EGAMMA | user.ilijav.IOperf.EGAMMA.NTUP.allfiles.v1 |
| JETMET | user.ilijav.IOperf.JETMET.NTUP.allfiles.v5 |
AOD and ESD data sets contain:
- one original file: lb0300.0001.1
- one file with offending classes removed (MakeProject crashes on the original files)
- one file where baskets were not reordered
D3PD data sets listed above contain:
- original files
- one file produced by simply merging all the original files - what we were using
- one merged file reordered byEntry and with optimized basket sizes - what we actually use
- one merged file reordered byBranch and with optimized basket sizes
To be able to reproduce the situation we had before we introduced automatic basket reordering in the finalize step of the Athena production jobs, I have manually rewritten the real data AODs and ESDs.
Corresponding (full) datasets are intended for large scale (not local disk) tests:
- user.ilijav.IOperf.Unordered.data10_7TeV.00159113.physics_L1Calo.merge.AOD.f275_m548.v3/
- user.ilijav.IOperf.Unordered.data10_7TeV.00159113.physics_L1Calo.recon.ESD.f275.v3/
If you have enough space, please replicate them as soon as possible, as they currently exist only on scratch disk. Their size should be the same as that of the original datasets.
Tests
Local disk
When doing this kind of test it is very important:
- that the machine is completely free of any other activity
- to always use "taskset -c N" to pin the job to the same core, as different cores can show quite different performance
- to always evict the file being read from the cache. On Linux systems you can use the following tools, available in the Athena environment:
- releaseFileCache.exe
usage: releaseFileCache <filename>
removes from memory all cached pages associated with the given file.
- checkCache.exe
usage: checkCache <filename>
prints out all the cached blocks of the given file. Warning: for a large file that may mean a big printout.
- for MacOS/Windows, send me a mail and I'll send you the details of how to clean the cache.
Reading/Writing from Athena
The purpose of these tests is to measure the pure CPU time needed to read AOD/ESD files. We also want to show that IO performance depends on object size and object complexity.
For this series of tests we will use both MC and real data (8 tests in total: 2x AOD/ESD, 2x read/write, 2x MC/real data). One file of each kind is enough.
The numbers needed (for each collection separately) are: collection name, collection size, compression factor, total time to read in, and ROOT read time.
While the same numbers could be collected for writing, I can't extract any meaning from them, as the actual writing is delegated to the commit procedure, which is not instrumented.
Split level, zip level, and basket resize options will not be checked, as these are deemed unaffordable in CPU or disk.
Relevant code may be found in svn://svn.cern.ch/reps/io_perf/Tools/LocalDisk/Athena
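The per-collection bookkeeping above reduces to two numbers per read: a compression factor and a wall-clock time. A minimal Python sketch of that bookkeeping, assuming sizes in bytes; the helper names are mine, not part of the Athena code:

```python
import time

def compression_factor(uncompressed_bytes, compressed_bytes):
    """Compression factor, taken as uncompressed size over on-disk (zipped) size."""
    return uncompressed_bytes / compressed_bytes

def timed_read(read_fn, *args):
    """Run one read call and return (result, elapsed wall time in seconds)."""
    t0 = time.perf_counter()
    result = read_fn(*args)
    return result, time.perf_counter() - t0
```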
Reading from ROOT script
The purpose of this test is to show the influence of file organization on IO performance, especially on real time and disk utilization.
The main test here is reading in 4 different scenarios:
- reading full file
- randomly chosen 10% events
- randomly chosen 1% of events
- just some branches
All of these should be done in the following ways:
- file is unordered
- file is reordered byEntry - current situation with AODs and ESDs
- unordered file using TTreeCache
- if possible to do in time - make files with all ROOT defaults - autoflush at 30 MB, basket size optimized by ROOT
- for D3PDs, a file ordered byBranch
Relevant code may be found in : svn://svn.cern.ch/reps/io_perf/Tools/LocalDisk/Root
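The sparse-read scenarios (10% and 1% of events) are only comparable across the unordered, byEntry, and byBranch files if every run touches the same events, so the random choice should be seeded. A minimal sketch, assuming a fixed per-test seed; the function name is mine:

```python
import random

def sample_events(n_events, fraction, seed=42):
    """Deterministically pick ~fraction of the event indices to read."""
    rng = random.Random(seed)
    return [i for i in range(n_events) if rng.random() < fraction]
```

Because the selection is seeded, repeated runs hit identical event lists on every file layout being compared.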
LARGE SCALE tests
It would be great to have these tests done in a controlled environment, but that will be difficult to arrange these days.
Ideally, the tests would be run:
- on unordered data
- on standard data
- on data with the new default ROOT settings (Athena 16.+)
Results should include:
- total run time
- total data transfer or data transfer rate vs time
- average or versus time CPU utilization
- average or versus time events/s
Tests:
- D3PD making - run AODToEgammaD3PD.py. 6 tests in total: 2x AOD/ESD, each on unordered data, official data, and "feature" data
- PROOF reading / analysis of D3PDs - 100%, 10%, 1% - on unordered / fully optimized / newRoot-optimized D3PDs
DPM
xrootd
dCache
Results
Most of the results will be collected in one Google spreadsheet, which is open for public editing and can be found here:
https://spreadsheets.google.com/ccc?key=0AiPvgbRljNCodEFtVEtmRERBTExRb2xVcldNeXE1X2c&hl=en&pli=1#gid=1
Some parts of it are already filled in. Please feel free to add your own results.
--
IlijaVukotic - 08-Sep-2010