CMSSW I/O in synchronous mode

This page discusses CMSSW I/O under the following conditions:
  • Running PAT tuple creation.
  • With the CMS I/O patches found here.
  • cacheSize set to 20MB.
  • readHint mode set to "application-only".
  • Reading out 40 events; after the first 20 events, the TTreeCache is activated.

Below are a few graphs showing the I/O activity, with the zoomed-in picture focusing on the area where TTreeCache fills its buffer (TTreeCache::FillBuffer):

  • I/O activity on analyzed file over the whole run of the application:
    trace_appcache.png
    There are four main bands of activity. The first is when ROOT opens the file. The second is when ROOT starts opening and processing the TTree itself (figuring out what it needs to deserialize, plus event metadata). The third is the read of the first event. After this, there is a long pause in I/O while the conditions data is loaded. The fourth is the run through the file itself. The first 20 events are read using "normal" ROOT I/O. The TTreeCache buffer is then filled, and there are no more stalls for the rest of the file, as CMSSW already has almost all the data it needs for a while.
  • Focus on the time where ROOT was filling the TTreeCache buffers:
    trace_appcache_zoom_med.png
    This solid band of green represents ROOT synchronously reading data in from the file. Notice that there are a lot of "stalls": the I/O pattern appears completely random to the operating system, so the OS fails to predict any future reads.
  • Notice that the disk-level parallelism is pretty low: at most, a single ~128KB I/O request is split into ~30 block requests, which are coalesced into 4 distinct requests to the block device.
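
The two read phases described for the first graph (20 events of "normal" reads, then a TTreeCache fill that covers the rest of the run) can be illustrated with a toy model. This is only a sketch and not ROOT's implementation: the class ToyTreeCache, the function simulate, and the figure of 100 branches per event are hypothetical, and it makes the simplifying assumption that every branch costs one small read per event during the learning phase.

```python
from dataclasses import dataclass
from typing import Tuple

LEARN_ENTRIES = 20  # from the text: the cache is activated after the first 20 events


@dataclass
class ToyTreeCache:
    """Toy model only; this is NOT how ROOT's TTreeCache is implemented."""

    branches_per_event: int          # hypothetical number of branches read per event
    learn_entries: int = LEARN_ENTRIES
    fill_span: int = 20              # hypothetical: events covered by one buffer fill
    small_reads: int = 0             # individual reads issued during the learning phase
    buffer_fills: int = 0            # number of times the buffer was filled
    cached_until: int = 0            # events below this index are served from the buffer

    def read_event(self, entry: int) -> None:
        if entry < self.learn_entries:
            # Learning phase: simplified to one small read per branch per event.
            self.small_reads += self.branches_per_event
            return
        if entry >= self.cached_until:
            # Buffer fill: fetch everything needed for the next fill_span events.
            self.buffer_fills += 1
            self.cached_until = entry + self.fill_span
        # Otherwise the event is served from the buffer and no read is issued.


def simulate(n_events: int = 40, branches_per_event: int = 100) -> Tuple[int, int]:
    """Return (small_reads, buffer_fills) for a run over n_events events."""
    cache = ToyTreeCache(branches_per_event)
    for entry in range(n_events):
        cache.read_event(entry)
    return cache.small_reads, cache.buffer_fills


if __name__ == "__main__":
    print(simulate())
```

In this model the last 20 events trigger a single buffer fill and no further small reads, which mirrors the absence of stalls after the fill in the first graph. The fill itself is not free: as the zoomed-in graph shows, it is a burst of synchronous reads.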

My claim would be that synchronous mode, while a clear improvement, is still not optimal because it assumes ROOT knows best how to organize requests. We have no knowledge of whether two pieces of data that are non-sequential in the file are actually sequential at the block layer. The point behind asynchronous I/O is to allow the block layer, which has a much better, if not complete, idea of the layout, to determine which reads to coalesce or split.
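
To make the distinction concrete, here is a minimal sketch under stated assumptions; it is not the code used by ROOT or CMSSW. The function coalesce shows the application-side approach: sort requests by file offset and merge neighbours, which can only use file offsets and knows nothing about the on-disk layout. The function hint_upcoming_reads shows one way, on systems that provide posix_fadvise, to hand the kernel the whole list of upcoming ranges and let the lower layers decide how to schedule them. Both function names and the max_gap parameter are hypothetical.

```python
import os
from typing import List, Tuple

Request = Tuple[int, int]  # (file offset, length in bytes)


def coalesce(requests: List[Request], max_gap: int = 0) -> List[Request]:
    """Sort by file offset and merge requests separated by at most max_gap bytes.

    This uses file offsets only; it cannot tell whether two ranges are
    adjacent on the block device.
    """
    merged: List[Request] = []
    for offset, length in sorted(requests):
        if merged:
            prev_off, prev_len = merged[-1]
            prev_end = prev_off + prev_len
            if offset <= prev_end + max_gap:
                # Extend the previous request to cover this one.
                new_end = max(prev_end, offset + length)
                merged[-1] = (prev_off, new_end - prev_off)
                continue
        merged.append((offset, length))
    return merged


def hint_upcoming_reads(fd: int, requests: List[Request]) -> bool:
    """Tell the kernel which ranges will be read soon, without reading them here.

    Returns False on platforms where os.posix_fadvise is unavailable.
    """
    if not hasattr(os, "posix_fadvise"):
        return False
    for offset, length in requests:
        os.posix_fadvise(fd, offset, length, os.POSIX_FADV_WILLNEED)
    return True
```

For example, coalesce([(4096, 100), (0, 50), (50, 10), (4000, 96)]) merges the four requests into [(0, 60), (4000, 196)]. Whether those two ranges are near each other on disk is exactly what the application cannot know.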

Topic revision: r2 - 2010-05-13 - BrianBockelman
 