Exploring the Streams Test Tags

Sanity checks

This page began as something close to a query blog: a stream-of-consciousness sequence of queries, each prompted by the one before. The hope is that these queries both reveal useful information about the streams test samples and provide examples of query formulations that are not always trivial.

Let us imagine that we are interested in all events that passed the L2_e25i trigger. The first place to look is the inclusive electron stream, right?

Q1. ...every event that passed the 25 GeV electron trigger.



This looks okay, but did you see some events with LooseElectronPt1 = 0? This suggests that no loose electrons were found, even though these events passed the L2_e25i trigger. That is not shocking, but perhaps worth a look:

Q2. ...every event that passed the 25 GeV electron trigger with no loose electrons.



This suggests that 3-4% of events passing the L2_e25i trigger yield no reconstructed electrons. That is plausible, but the trigger folks might be interested in these events.
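The queries themselves are not reproduced on this page. As a rough illustration only, the selection and the fraction discussed above could be computed along the following lines. The table name `electron_tags`, the columns `RunNumber` and `EventNumber`, and the representation of the trigger decision as a 0/1 column named `L2_e25i` are all assumptions, and the sample rows are invented; only `LooseElectronPt1` and the trigger name come from the text. The sketch uses Python's built-in `sqlite3` module as a stand-in for the real tag database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE electron_tags (
        RunNumber INTEGER,
        EventNumber INTEGER,
        L2_e25i INTEGER,        -- assumed encoding: 1 if the trigger fired
        LooseElectronPt1 REAL   -- 0 is taken to mean "no loose electron found"
    )
""")
# Invented sample rows, for illustration only.
conn.executemany(
    "INSERT INTO electron_tags VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 31.2), (1, 2, 1, 0.0), (1, 3, 0, 12.5), (1, 4, 1, 27.9)],
)

# Events that passed the trigger.
passed = conn.execute(
    "SELECT COUNT(*) FROM electron_tags WHERE L2_e25i = 1"
).fetchone()[0]

# Events that passed the trigger but have no loose electron.
no_electron = conn.execute(
    "SELECT COUNT(*) FROM electron_tags "
    "WHERE L2_e25i = 1 AND LooseElectronPt1 = 0"
).fetchone()[0]

fraction = no_electron / passed
print(f"{no_electron} of {passed} triggered events have no loose electron")
```

With real data, `fraction` would be the quantity to compare against the 3-4% observed above.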

Let's continue with the e25i selection thread. With inclusive streaming, all events that satisfy L2_e25i should be in the inclusive electron stream, but perhaps we should sample another stream, say the inclusive muon stream, to be sure:

Q3. ...every event in the inclusive muon stream that passed L2_e25i, but that does not appear in the inclusive electron stream.



This is potentially troubling: there are L2_e25i events in the muon stream that are missing from the inclusive electron stream. Perhaps there is a straightforward explanation, though: the production system may simply not yet have uploaded these inclusive electron stream events. Let's look at the Stream field, which is a bit mask. Its lowest-order bit marks whether the event qualifies for the jet stream, its second-lowest-order bit whether it qualifies for the electron stream, its third-lowest the muon stream, its fourth-lowest the photon stream, and its fifth-lowest the tau stream.
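To make the bit layout concrete, here is a small sketch that decodes a Stream value according to the description above. The helper names `in_stream` and `decode_streams` are made up for this illustration, and the example values are invented; only the bit assignments come from the text.

```python
# Bit positions as described in the text (bit 0 is the lowest-order bit).
STREAM_BITS = {"jet": 0, "electron": 1, "muon": 2, "photon": 3, "tau": 4}


def in_stream(stream_field, name):
    """Return True if the bit for the named stream is set in stream_field."""
    return bool((stream_field >> STREAM_BITS[name]) & 1)


def decode_streams(stream_field):
    """List the names of all streams whose bits are set in stream_field."""
    return [name for name in STREAM_BITS if in_stream(stream_field, name)]


# Invented example values:
print(decode_streams(0b00110))         # electron and muon bits set
print(in_stream(0b00100, "electron"))  # only the muon bit is set
```

An event in the muon stream whose value fails `in_stream(value, "electron")` would not be expected in the inclusive electron stream at all, whatever the state of the upload.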

Q4. ...check the "electron stream" flag for the anomalous events in the inclusive muon stream.



When this query was run on 12 June 2007, it appeared that there were 64 L2_e25i events in the inclusive muon stream that did not appear in the inclusive electron stream. Of these, 15 could plausibly arrive later, but 49 would probably never arrive (their electron stream bit is not set to 'Yes'). Why?
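For readers who want to reproduce this kind of breakdown, the following is a hedged sketch of one way to phrase it, not the query that produced the numbers above. It finds triggered muon-stream events with no counterpart in the electron stream and groups them by their electron stream bit. The table names `muon_tags` and `electron_tags`, the column names, and the sample rows are assumptions; the bitwise test relies on SQLite's `>>` and `&` operators, which other databases may spell differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for table in ("muon_tags", "electron_tags"):
    conn.execute(
        f"CREATE TABLE {table} (RunNumber INTEGER, LumiBlock INTEGER, "
        "EventNumber INTEGER, L2_e25i INTEGER, Stream INTEGER)"
    )
# Invented rows. Stream = 6 has the electron and muon bits set, 4 muon only.
conn.executemany(
    "INSERT INTO muon_tags VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 10, 1, 6),  # also present in the electron stream
        (1, 1, 11, 1, 6),  # electron bit set, but missing: may arrive later
        (1, 1, 12, 1, 4),  # electron bit not set: will probably never arrive
        (1, 2, 13, 0, 4),  # did not pass the trigger
    ],
)
conn.execute("INSERT INTO electron_tags VALUES (1, 1, 10, 1, 6)")

# Anti-join: triggered muon-stream events with no match in the electron
# stream, counted separately for each value of the electron stream bit.
breakdown = conn.execute("""
    SELECT (m.Stream >> 1) & 1 AS electron_bit, COUNT(*)
    FROM muon_tags AS m
    WHERE m.L2_e25i = 1
      AND NOT EXISTS (
          SELECT 1 FROM electron_tags AS e
          WHERE e.RunNumber = m.RunNumber
            AND e.LumiBlock = m.LumiBlock
            AND e.EventNumber = m.EventNumber
      )
    GROUP BY electron_bit
    ORDER BY electron_bit
""").fetchall()
print(breakdown)  # rows of (electron_bit, number of missing events)
```

The luminosity block is included in the match as a precaution, in case run and event numbers alone do not identify an event.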

One thing we didn't check yet is whether, in the case of inclusive streaming, the data could be different in the different stream instances when events are written to more than one stream. Maybe later...

...so how does one handle the situation when one's events are spread across multiple streams? It is easy to apply the same selection to all streams; the difficulty comes when one has to eliminate duplicates. With exclusive streaming there should be no duplication, but, given our surprise discovery that some L2_e25i events are not being written to the inclusive electron stream, this needs to be checked.

Let's first see how the selection works, without worrying about duplicates:

Q5. ...all events that satisfy L2_e25i, independent of (exclusive) stream.



Aside: The tag query web pages provide a means to apply the same selection to N streams without this tedious repetition.

Note the number of events returned. Now let's repeat exactly this, removing duplicates. If this operation reduces the event count, exclusive streaming is failing to be exclusive.

Q6. ...all distinct events that satisfy L2_e25i, independent of (exclusive) stream.


When the above two queries were run on 12 June 2007, they returned exactly the same number of events. This reassures us that exclusive streaming is indeed exclusive, so this method of selecting across all streams is safe even without explicit duplicate removal.
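The comparison can be sketched as follows, under assumptions: one tag table per exclusive stream with the hypothetical names `excl_electron_tags`, `excl_muon_tags`, and `excl_jet_tags`, a 0/1 column `L2_e25i`, and invented rows. `UNION ALL` keeps every row from every stream, while `UNION` removes duplicates, so equal counts indicate that no selected event appears in more than one stream. Building the per-stream selections in a loop also avoids writing the same selection out once per stream.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Invented rows: (RunNumber, LumiBlock, EventNumber, L2_e25i).
streams = {
    "excl_electron_tags": [(1, 1, 10, 1), (1, 1, 11, 1)],
    "excl_muon_tags": [(1, 1, 12, 1), (1, 2, 13, 0)],
    "excl_jet_tags": [(1, 2, 14, 1)],
}
for table, rows in streams.items():
    conn.execute(
        f"CREATE TABLE {table} (RunNumber INTEGER, LumiBlock INTEGER, "
        "EventNumber INTEGER, L2_e25i INTEGER)"
    )
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?, ?)", rows)


def count_selected(set_operator):
    """Count L2_e25i events over all streams, combined with set_operator."""
    selects = [
        f"SELECT RunNumber, LumiBlock, EventNumber FROM {t} WHERE L2_e25i = 1"
        for t in streams
    ]
    combined = f" {set_operator} ".join(selects)
    return conn.execute(f"SELECT COUNT(*) FROM ({combined})").fetchone()[0]


n_all = count_selected("UNION ALL")   # duplicates kept
n_distinct = count_selected("UNION")  # duplicates removed
print(n_all, n_distinct, "exclusive" if n_all == n_distinct else "overlap!")
```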

Note on exclusive and inclusive streaming: When one selects across multiple streams in an exclusive streaming model, one does not, in principle (and, thus far, in practice), need to remove duplicates. For tag-based selections, this means that one does not need the (relatively expensive) DISTINCT operator. For inclusive streams, it is not obvious how to do duplicate removal on the database side. SELECT DISTINCT would suffice to remove duplicates from returned tag attributes, but selections that return pointers to AOD, ESD, RAW, and so on would not be treated as duplicates if the pointers refer to different streams' files.
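A toy example may make the last point clearer. Here the same event is recorded in two inclusive streams, each with its own file pointer. The table `incl_tags`, the column `AodRef`, and the pointer strings are all invented for illustration and do not reflect the real pointer format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE incl_tags (StreamName TEXT, RunNumber INTEGER, "
    "LumiBlock INTEGER, EventNumber INTEGER, AodRef TEXT)"
)
# One physical event written to two inclusive streams (invented values).
conn.executemany(
    "INSERT INTO incl_tags VALUES (?, ?, ?, ?, ?)",
    [
        ("electron", 1, 1, 10, "electron_stream_file#10"),
        ("muon", 1, 1, 10, "muon_stream_file#10"),
    ],
)

# DISTINCT over tag attributes alone collapses the two copies...
attrs_only = conn.execute(
    "SELECT DISTINCT RunNumber, LumiBlock, EventNumber FROM incl_tags"
).fetchall()

# ...but once a per-stream pointer is selected, the rows differ and both stay.
with_pointer = conn.execute(
    "SELECT DISTINCT RunNumber, LumiBlock, EventNumber, AodRef FROM incl_tags"
).fetchall()

print(len(attrs_only), len(with_pointer))
```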

Perhaps you are wondering why we are displaying luminosity block numbers--shouldn't run and event numbers be good enough to identify an event uniquely? Let's check.

Q7. ...count the total number of events, the number of events with distinct {run, event} numbers, and the number with distinct {run, event, lumiblock} numbers.



It appears that run and event numbers do not uniquely identify events, but that {run, event, lumiblock} information suffices.
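A minimal sketch of the three counts follows, assuming a single hypothetical table `tags` with columns `RunNumber`, `LumiBlock`, and `EventNumber`, and invented rows in which one event number recurs in a different luminosity block. Each distinct count is taken over a subquery, since not every SQL dialect accepts several columns inside `COUNT(DISTINCT ...)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tags (RunNumber INTEGER, LumiBlock INTEGER, "
    "EventNumber INTEGER)"
)
# Invented rows: event number 100 appears in two luminosity blocks of run 1.
conn.executemany(
    "INSERT INTO tags VALUES (?, ?, ?)",
    [(1, 1, 100), (1, 2, 100), (1, 2, 101)],
)


def count(sql):
    """Run a query that returns a single number and return that number."""
    return conn.execute(sql).fetchone()[0]


total = count("SELECT COUNT(*) FROM tags")
by_run_event = count(
    "SELECT COUNT(*) FROM (SELECT DISTINCT RunNumber, EventNumber FROM tags)"
)
by_run_event_lumi = count(
    "SELECT COUNT(*) FROM "
    "(SELECT DISTINCT RunNumber, EventNumber, LumiBlock FROM tags)"
)
print(total, by_run_event, by_run_event_lumi)
```

If the second count is smaller than the first while the third equals it, then {run, event} is not a unique key but {run, event, lumiblock} is.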

Exercise: How would you check whether an event appears more than twice?
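One possible approach, offered as a sketch rather than as the intended answer, is to group on the identifying columns and keep only the groups with more than two rows. The table `incl_tags`, its columns, and the rows are invented for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE incl_tags (StreamName TEXT, RunNumber INTEGER, "
    "LumiBlock INTEGER, EventNumber INTEGER)"
)
# Invented rows: event 10 is in three streams, event 11 in two, event 12 in one.
conn.executemany(
    "INSERT INTO incl_tags VALUES (?, ?, ?, ?)",
    [
        ("electron", 1, 1, 10), ("muon", 1, 1, 10), ("jet", 1, 1, 10),
        ("electron", 1, 1, 11), ("muon", 1, 1, 11),
        ("jet", 1, 2, 12),
    ],
)

# Group by the identifying columns and keep groups with more than two rows.
more_than_twice = conn.execute("""
    SELECT RunNumber, LumiBlock, EventNumber, COUNT(*) AS copies
    FROM incl_tags
    GROUP BY RunNumber, LumiBlock, EventNumber
    HAVING COUNT(*) > 2
    ORDER BY RunNumber, LumiBlock, EventNumber
""").fetchall()
print(more_than_twice)
```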

Q8. ...to be continued (feel free to enter your own query here)....

A8.



-- DavidMalon - 11 Jun 2007

Topic revision: r5 - 2007-06-25 - DavidMalon