Trigger AOD Size Reduction for Rel 13

TrigDecision and HLTResult in 13.0.30

A potential size reduction for 13.0.30 was to turn the HLTResults in TrigDecision into DataLinks instead of embedded data members. This will not work: when trigger hypos are re-run, the original HLTResult is overwritten in StoreGate, so both the old and the new TrigDecision objects would point to the same HLTResult.

To get around this, there seem to be two solutions:

  • keep the HLTResult as an embedded data member of TrigDecision, so that each new TrigDecision object contains the new HLTResult, and do not save the HLTResult separately
  • make the HLTResult in TrigDecision into a DataLink, then keep multiple copies of HLTResult in StoreGate. This requires rewriting the TrigSteering/LoopbackConverterFromPersistency class so that the old HLTResult is not removed and the new one gets some sensible name.

The first solution seems more robust: there is no way to accidentally delete the HLTResult object that TrigDecision needs, and there is less book-keeping. Neither solution is feasible for a 13.0.30 pcache, so it will have to wait until 13.X.0, and we will have to live with the duplicated 10kB from TrigDecision.
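
The aliasing problem can be illustrated with a toy model. This is only a sketch: a plain dict stands in for StoreGate, and LinkedDecision and EmbeddedDecision are hypothetical stand-ins for the two designs, not the real Athena classes.

```python
from dataclasses import dataclass

# Toy stand-in for StoreGate: one record per key, overwritten on re-record.
store = {}


@dataclass
class LinkedDecision:
    """Hypothetical TrigDecision that only holds a key (DataLink-like)."""
    key: str

    def result(self):
        # Resolved through the store at access time.
        return store[self.key]


@dataclass
class EmbeddedDecision:
    """Hypothetical TrigDecision that carries its own HLTResult copy."""
    payload: str

    def result(self):
        return self.payload


# First pass: record the HLTResult and build both kinds of decision.
store["HLTResult_EF"] = "original result"
old_linked = LinkedDecision("HLTResult_EF")
old_embedded = EmbeddedDecision(store["HLTResult_EF"])

# Hypos are re-run: the record under the same key is overwritten.
store["HLTResult_EF"] = "re-run result"
new_linked = LinkedDecision("HLTResult_EF")

print(old_linked.result())    # the old linked decision now sees the re-run result
print(old_embedded.result())  # the embedded copy is unaffected
```

In this model the link-based design only works if every re-run records its result under a new key, which corresponds to the second solution above.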

-- AndrewHamilton - 16 Oct 2007

Results with new menu in 13.0.30.1 Sample A

Here is a table of the event size breakdown for 5 datasets (all numbers are in kB/event, based on 1000 events). The numbers are extracted using checkFile.py and checkFileParse.py; they are all within 5% of the 'correct' event size (checkFile.py underestimates the size):

Dataset (AOD)           total size  event  truth  calo  indet  muon  met  jet   tau  eg    trigger
5011.J2_pythia_jetjet   187         6.0    31.3   25.7  26.2   3.7   3.3  36.1  1.2  6.5   42.1
5144.PythiaZee          175         6.4    25.6   20.7  19.0   3.9   3.3  33.8  1.4  5.6   49.7
5702.PythiaB_BsJpsiphi  220         6.0    29.0   23.9  26.3   32.0  3.4  37.9  0.8  3.4   51.2
6384.PythiaH120gamgam   179         6.4    26.6   22.3  20.2   3.5   3.4  32.9  1.3  5.0   51.5
5200.T1_McAtNlo_Jimmy   418         8.1    53.7   36.8  45.8   17.6  3.8  65.6  3.4  18.2  158.0

-- AndrewHamilton - 08 Oct 2007

Results with new menu in 13.0.30

From /afs/cern.ch/atlas/project/RTT/Work/rel_0/val/build/i686-slc4-gcc34-opt/offline/RecExAnaTest/AthenaRecExCommon/RecExTrigTest_RTT_esdprod/319/AOD.pool.root.checkFile we can see that we are now back up to ~160kB/event for the trigger.

Using checkFileParse.py (attached to this topic; remove the .txt extension to use it), I get the following breakdown:

Summary of catagories:
8.132 eventinfo
63.745 truth
36.378 calo
44.364 indet
18.299 muon
3.324 met
64.346 jet
3.194 tau
17.471 egamma
162.405 trigger

With the further breakdown of the trigger items (407 events in file):

Trigger Items:
kB/evt   n items   class
0.023   536     POOLContainer_MuonFeature
0.039   814     POOLContainer_TrigMissingET
0.090   722     POOLContainer_TrigMuonEFContainer
0.135   407     TrigConf::Lvl1AODPrescaleConfigData_p1_AODConfig-0
0.268   2589    POOLContainer_TrigT2Jet
0.290   503     POOLContainer_CombinedMuonFeature
0.297   407     CTP_Decision_p2_CTP_Decision
0.306   1944    POOLContainer_TrigTau
0.371   6457    POOLContainer_DataVector<TrigL2Bjet>
0.375   15103   POOLContainer_TrigRoiDescriptor
0.392   6457    POOLContainer_DataVector<TrigEFBjet>
0.452   407     LVL1_ROI_p1_LVL1_ROI
0.626   407     TrigConf::Lvl1AODConfigData_p1_AODConfig-0
0.647   277     POOLContainer_DataVector<TrigEFBphys>
0.710   1928    POOLContainer_TauJetContainer_p1
0.779   2144    POOLContainer_TrigEMCluster
1.085   3093    POOLContainer_egammaContainer_p1
1.135   2009    POOLContainer_TrigTauCluster
1.508   1928    POOLContainer_TauDetailsContainer_tlp1
1.627   3927    POOLContainer_CaloClusterContainer_p2
2.153   3745    POOLContainer_DataVector<TrigElectron>
2.531   3868    POOLContainer_egDetailContainer_p1
2.831   5220    POOLContainer_DataVector<TrigPhoton>
4.002   407     HLT::HLTResult_p1_HLTResult_L2
4.116   407     TrigInDetTrackTruthMap_MyTrigInDetTrackTruthMap
6.818   407     HLT::HLTResult_p1_HLTResult_EF
10.879  407     TrigDec::TrigDecision_p1_TrigDecision
12.345  481     POOLContainer_DataVector<TrigL2Bphys>
13.855  407     TrigConf::HLTAODConfigData_p1_AODConfig-0
16.150  20827   POOLContainer_DataVector<TrigVertex>
32.086  6356    POOLContainer_Rec::TrackParticleContainer_tlp1
43.484  12100   POOLContainer_TrigInDetTrackCollection
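
The "n items" column can be turned into a per-event multiplicity by dividing by the 407 events in the file. A minimal sketch, using a few rows copied from the listing above; the dictionary and variable names are illustrative and are not part of checkFileParse.py.

```python
N_EVENTS = 407  # events in the file, as quoted above

# class name -> (kB/evt, n items), copied from the listing above
rows = {
    "POOLContainer_DataVector<TrigVertex>": (16.150, 20827),
    "POOLContainer_TrigInDetTrackCollection": (43.484, 12100),
    "POOLContainer_Rec::TrackParticleContainer_tlp1": (32.086, 6356),
    "TrigConf::HLTAODConfigData_p1_AODConfig-0": (13.855, 407),
}

# Average number of stored items per event for each class.
per_event = {name: n_items / N_EVENTS for name, (_, n_items) in rows.items()}

for name, mult in sorted(per_event.items(), key=lambda kv: -kv[1]):
    kb = rows[name][0]
    print(f"{mult:6.1f} items/evt  {kb:7.3f} kB/evt  {name}")
```

This gives roughly 51 TrigVertex and 30 TrigInDetTrackCollection items per event, while HLTAODConfigData appears exactly once per event.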

Ideas to reduce size:

  • make HLTAODConfigData once per file, not once per event, savings ~14kB/event, (Till)
  • make HLTResult and LVL1Result ElementLinks, not pointers, in TrigDecision, savings ~11kB/event, (Andrew) - not possible in 13.0.30, see above
  • make TrigVertex pointer in TrigL2Bphys transient, savings ~10kB/event, (Julie)
  • make std::list and double m_cov[6] transient in TrigVertex, savings ~12kB/event, (Julie - but are there other TrigVertex clients?)
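
As a rough cross-check, the quoted savings can be added up and compared with the 162.4 kB/event trigger total. This sketch assumes the four savings are independent and simply additive, which the list above does not state; the flag only records the one item explicitly ruled out for 13.0.30.

```python
trigger_total_kb = 162.405  # trigger category from the summary above

# (idea, quoted saving in kB/event, ruled out for 13.0.30?)
ideas = [
    ("HLTAODConfigData once per file", 14, False),
    ("HLTResult/LVL1Result as links in TrigDecision", 11, True),
    ("TrigVertex pointer in TrigL2Bphys transient", 10, False),
    ("std::list and m_cov[6] transient in TrigVertex", 12, False),
]

all_savings = sum(kb for _, kb, _ in ideas)
not_ruled_out = sum(kb for _, kb, ruled_out in ideas if not ruled_out)

print(f"all ideas:         {all_savings} kB/evt "
      f"-> {trigger_total_kb - all_savings:.1f} kB/evt left")
print(f"without TrigDec.:  {not_ruled_out} kB/evt "
      f"-> {trigger_total_kb - not_ruled_out:.1f} kB/evt left")
```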

However, the biggest factor in the size change is the number of RoIs per event (50 TrigVertex and 30 TrigInDetTrackCollection per event!) due to the change of menu. Disabling the BphysicsSlice significantly reduces the size of the trigger EDM. The table below shows the trigger size per event, based on the checkFileParse.py calculation after correcting for checkFile estimate errors:

50 ttbar events         AOD file size (kB)  size from checkfile sum  correction  trigger size (kB/evt after corr.)
13.0.30 default         13719               13080                    1.05        173
13.0.30 no BPhysSlice   10281               9650                     1.06        107
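
The correction column can be reproduced from the two size columns. This sketch assumes the correction is simply the ratio of the real AOD file size to the checkFile sum; the table does not spell out how it was obtained, and the variable names are illustrative.

```python
# sample -> (AOD file size in kB, size from checkFile sum, quoted trigger kB/evt)
samples = {
    "13.0.30 default": (13719, 13080, 173),
    "13.0.30 no BPhysSlice": (10281, 9650, 107),
}

# Assumed definition: correction = real file size / checkFile sum.
corrections = {
    name: file_kb / checkfile_sum
    for name, (file_kb, checkfile_sum, _) in samples.items()
}

# Trigger size saved by disabling the BphysicsSlice, from the quoted numbers.
trigger_saving = samples["13.0.30 default"][2] - samples["13.0.30 no BPhysSlice"][2]
relative_saving = trigger_saving / samples["13.0.30 default"][2]

for name, corr in corrections.items():
    print(f"{name:24s} correction = {corr:.3f}")
print(f"saving = {trigger_saving} kB/evt ({relative_saving:.0%} of the default)")
```

Under that assumption the ratios agree with the quoted 1.05 and 1.06 to within rounding, and disabling the BphysicsSlice saves 66 kB/event.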

-- AndrewHamilton - 26 Sep 2007

List of changes that have been made:

  • the L1 objects RecEmTauRoI, RecJetRoI, and RecEnergyRoI were removed from ESD and AOD because they are not expected to be used for user trigger analysis. (very small space savings expected)

  • Olya removed persistence of pointers from TrigTau to TauCluster and TrigInDetTrackCollection. (Expect to save most of 6.4 kB/event since TrigTau is small compared to track and cluster. Cluster and track can be reached by navigation instead.)

  • L2Result and EFResult in 12.0.6 (total 23.8 kB/evt) are replaced by new HLTResult class in 13.0.0. Expect significant size reduction, to be quantified. This saving will however be offset by the config data unless that can be stored per run. (See end of next section).

  • Ricardo Goncalo removed the TrigInDetTrack and TrigEMCluster pointers from TrigElectron, but had to add 3 ints and 1 float. He also did the necessary changes to TrigL2IDCaloHypo/Fex and the configuration files (4 April 2007).

  • Denis Damazio changed the doubles to floats for Egamma and Taus (TrigCaloEvent-00-01-23 in CVS, but not yet in the tag collector). He tested it with TrigT2CaloEgamma/Common and Egamma ntuple filling (CBNT_TrigT2Calo) and found only one difference, of 10^(-6), in Eratio. Note that ESDs made before the change can no longer be read back (17 April 2007).

  • Iwona has made the change to allow track pt thresholds to be set depending on the algorithm, so EF tracks can have a pt threshold of 1 GeV for most triggers (tau and full scan excepted). Expected space saving for EF is 16% of 25 kB = 4 kB; will be less because of tau exception.

  • Carlo arranged to drop tracks from SiTrack with only 3 space points, as they are never used. Estimated saving: 50% of the tracks from the L2 b-jet slice. There are approx. 3.4 J20 RoIs per event (J20 RoIs are processed by the b-jet trigger), so this affects 3.4 of the 19.8 track collections per event; these probably have more tracks than average, so expect to save at least 1.7 kB (at approx. 1 kB/collection).
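
The SiTrack estimate in the last item can be written out as a short calculation. All inputs are the numbers quoted in that item; the variable names are illustrative.

```python
j20_rois_per_event = 3.4       # J20 RoIs per event, each giving one b-jet track collection
collections_per_event = 19.8   # estimated track collections per event
kb_per_collection = 1.0        # approximate size of one track collection
dropped_fraction = 0.5         # estimated share of tracks with only 3 space points

# Share of all track collections that come from the b-jet slice.
affected_share = j20_rois_per_event / collections_per_event

# Lower bound on the saving, taking these collections to be of average size.
saving_kb = j20_rois_per_event * kb_per_collection * dropped_fraction

print(f"affected collections: {affected_share:.0%}")
print(f"expected saving: at least {saving_kb:.1f} kB/evt")
```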

List of things we are currently testing:

  • doubles to floats
    • no problems expected
    • total savings expected ~10%
    • TrigInDetTrack needs double precision during reconstruction. Float precision is adequate for persistent rep so use TP separation
    • main classes now done

  • put min pt threshold on TrigInDetTracks in track collections
    • ~8% of tracks in all LVL2 track collections and about 16% of EF tracks have pt < 1 GeV
    • most LVL2 low-pt tracks come from TRTxk - see below. Try: TRTMinRecPt = 1000 #MeV. Expected saving ~1.8 kB.

  • remove covariance matrix from Rec::TrackParticle
    • made possible with new version of TrackParticle. Needs change in the way they are created by EF ID code. Jiri & Andrew will look into this for 13.0.20.

  • drop the VxContainer
    • where vertex position is needed (e.g. for tau, bjet, bphysics), use Trk::RecVertex (or even just Trk::Vertex) as the object put in the HLT navigation and retrieved by the hypothesis. It holds the vertex position as a Hep3Vector (in the Vertex base class) and, if needed, RecVertex holds the ErrorMatrix and FitQuality.
    • ok for high-pt electrons, photons, muons and single-prong taus. 3-prong taus make their own RecVertex so also ok.
    • to check: b-jets, b-physics
    • save most of 14.6 kB?
    • what needs to be done in practice?
      • dropping VxContainer from the output stream lists should be enough
      • preferably, ElementLinks should not be set either, otherwise trying to dereference them in AOD would invoke lengthy attempts to back track before failing.

  • Trigger configuration in AOD
    • currently (rel 13 nightlies) stored event by event
    • estimated size for 400 chains (based on CDF/D0 trigger) is 40kB; for rel 13 menu estimate approx 5kB.
    • should be stored per run instead, waiting for RDS/DM to provide infrastructure.

List of things we have thought of and the reason we are not pursuing them:

  • Use of ElementLinks in the LVL2 trigger EDM: not pursued because ElementLinks require StoreGate, which is not necessarily available online.

  • New navigation: feature request has been made to Tomasz to make it possible to selectively drop collections from one algorithm but not another, if the algorithms save data in the HLT navigation with their instance name as the label. It is not currently supported but is being considered. - deferred to rel 14

Understanding the size of the 12.0.6 trigger AOD

File: 100 top events: mc11.004100.T1_McAtNLO_top.digit.RDO.v11000401._00001
Run 12.0.6 RecExCommon with AllAlgs=False.
Total size 116 kB/event

RoI type   Average number of RoIs per event
MU         0.6
EM         3.6
HA         5.0
JT         6.8

Top 10 classes which take 90% of size.

No. collections/evt  Disk size/evt (kB)  Class name
12.2                 25.1                Rec::TrackParticleContainer
22.6                 22.4                TrigInDetTrackCollection
1.0                  18.0                EFResult
12.2                 14.6                VxContainer
4.7                  6.4                 TrigTau
1.0                  5.8                 L2Result
8.2                  4.1                 TrigElectron
1.0                  3.8                 DataHeader
9.0                  2.7                 JetCollection
1.0                  2.3                 TrigInDetTrackTruthMap

Understanding the number of collections

  • take into account the average no. of RoIs of each type
  • three LVL2 tracking algorithms run for each EM RoI, so three collections produced per EM RoI
  • MU and HA (tau) RoIs each run one tracking algorithm
  • One track collection is produced for JT20 (not the lowest jet threshold, about half the jets pass it)
  • TrigElectron collection is produced for each threshold (except the lowest) for each EM RoI
  • e.g. TrigInDetTrackCollection: 3 (track algos) x 3.6 (EM RoIs) + 5 (HA) + 0.6 (MU) + 0.5 (JT20) x 6.8 = 19.8
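
The TrigInDetTrackCollection count can be rebuilt from the RoI averages in the table above. A minimal sketch; the dictionary names are illustrative, and the 0.5 pass fraction for JT20 is the "about half" quoted in the list.

```python
# Average number of RoIs per event, from the table above
rois = {"MU": 0.6, "EM": 3.6, "HA": 5.0, "JT": 6.8}

# LVL2 tracking algorithms run per RoI type
algos_per_roi = {"EM": 3, "HA": 1, "MU": 1}

# About half of the jet RoIs pass the JT20 threshold
jt20_pass_fraction = 0.5

n_track_collections = (
    sum(algos_per_roi[kind] * rois[kind] for kind in algos_per_roi)
    + jt20_pass_fraction * rois["JT"]
)
print(f"expected track collections per event: {n_track_collections:.1f}")
```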

Understanding the size of tracks

  • According to Dmitry's slides from the December Trigger AOD meeting:
    TrigInDetTrack with end params contains 80 doubles (x 8 bytes) and 7 ints (x 4 bytes) => 668 bytes/track
  • According to checkFile.py on the 100 top events above, the memory size (not compressed disk size) of TrigInDetTrackCollections is 34 kB/event
  • According to the CBNT (see below) there are approx 57 tracks per event.
  • 57 x 668 = 38 kB/event
    • Not too far from the actual size shown by checkFile.py
    • Difference: perhaps some tracks don't have end params?
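
The track-size arithmetic above can be checked in a few lines. The first part uses only the quoted numbers; the last two lines are a hypothetical extension, assuming every double in the track were stored as a 4-byte float, which is not a figure taken from the slides.

```python
DOUBLE_BYTES = 8
INT_BYTES = 4
FLOAT_BYTES = 4

n_doubles, n_ints = 80, 7      # TrigInDetTrack with end params
tracks_per_event = 57          # from the CBNT
measured_kb = 34               # memory size reported by checkFile.py

bytes_per_track = n_doubles * DOUBLE_BYTES + n_ints * INT_BYTES
estimated_kb = tracks_per_event * bytes_per_track / 1000

print(f"{bytes_per_track} bytes/track, {estimated_kb:.1f} kB/evt estimated")
print(f"measured / estimated = {measured_kb / estimated_kb:.2f}")

# Hypothetical: all doubles persisted as floats.
bytes_per_track_float = n_doubles * FLOAT_BYTES + n_ints * INT_BYTES
print(f"{bytes_per_track_float} bytes/track if all doubles became floats")
```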

The effect of pointers

  • TrigTau contains pointers to a TrigInDetTrackCollection and a TrigTauCluster. Since POOL/ROOT follows the pointers and inserts the actual objects, they are included in the TrigTau. Clearly these objects dominate the size of the TrigTau, so removing the pointers should reduce it to a negligible size on disk.

Details of Testing

Track Pt Cut

Made a CBNT by adding doWriteCBNT=True to my job options file. Looked at the pt of the tracking collections in ROOT by doing:

  • TFile* f = new TFile("ntuple.root");
  • TTree* t = (TTree*)f->Get("CollectionTree");
  • t->Draw("T2IdPt","(T2IdPt>-20)&&(T2IdPt<100000)");   // all LVL2 tracks
  • t->Draw("T2IdPt","(T2IdPt>-20)&&(T2IdPt<100000)&&(T2IdAlgo==1)");   // SiTrack only

See TrigInDetTrack.h for the definitions of the T2IdAlgo variable: SITRACKID=1, IDSCANID=2, TRTLUTID=3, TRTXKID=4

In 100 events, here is the number of LVL2 tracks with pt below 1 GeV in each collection:

Collection        No. tracks < 1 GeV   No. tracks < 100 GeV
SITRACKID         0                    1141
IDSCANID          34                   3248
TRTLUTID          0                    0
TRTXKID           428                  1354
All Collections   462                  5743
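
The low-pt fractions follow directly from these counts. A minimal sketch with the numbers from the table above; the variable names are illustrative.

```python
# algorithm -> (tracks with pt < 1 GeV, tracks with pt < 100 GeV)
counts = {
    "SITRACKID": (0, 1141),
    "IDSCANID": (34, 3248),
    "TRTLUTID": (0, 0),
    "TRTXKID": (428, 1354),
}

n_low = sum(low for low, _ in counts.values())
n_all = sum(total for _, total in counts.values())
low_fraction = n_low / n_all

for name, (low, total) in counts.items():
    frac = low / total if total else 0.0  # TRTLUTID has no tracks at all
    print(f"{name:10s} {low:5d} / {total:5d} = {frac:6.1%}")
print(f"{'all':10s} {n_low:5d} / {n_all:5d} = {low_fraction:6.1%}")

# Share of all low-pt LVL2 tracks that come from TRTxK.
trtxk_share = counts["TRTXKID"][0] / n_low
print(f"TRTXKID share of low-pt tracks: {trtxk_share:.0%}")
```

About 8% of all LVL2 tracks fall below 1 GeV, and more than 90% of those come from TRTXKID.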

EF tracks: 8257 in total, of which 1333 have pt under 1 GeV (16%).
Therefore the potential saving in EF track collection size from a 1 GeV min pt cut is about 4 kB (16% of 25 kB).
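
The 4 kB figure can be reproduced as follows. This sketch assumes the EF track size is the 25.1 kB/event quoted for Rec::TrackParticleContainer in the 12.0.6 table, and that the saving scales linearly with the number of tracks dropped.

```python
ef_tracks_total = 8257
ef_tracks_below_1gev = 1333
ef_track_size_kb = 25.1  # assumed: Rec::TrackParticleContainer disk size per event

low_pt_fraction = ef_tracks_below_1gev / ef_tracks_total
potential_saving_kb = low_pt_fraction * ef_track_size_kb

print(f"EF tracks below 1 GeV: {low_pt_fraction:.1%}")
print(f"potential saving: {potential_saving_kb:.1f} kB/evt")
```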

Working area for 12.0.6

A trial work area for 12.0.6 has been created at /afs/cern.ch/atlas/software/dist/trials/v-a/sgeorge/

Topic attachments

Attachment              Rev  Size    Date              Who             Comment
TauFromRDO.py.txt       r1   4.6 K   2007-04-04 10:29  AndrewHamilton  jobOptions to run taus in rel 13
checkFileParse.py.txt   r1   12.1 K  2007-09-26 18:14  AndrewHamilton
notes.txt               r2   1.8 K   2007-04-03 16:04  AndrewHamilton  how to use test area
Topic revision: r28 - 2007-10-16 - AndrewHamilton