Local player
Introduction
The problem we are trying to solve here is the following:
- Transparency - i.e. the ability to run the very same code in a local session and on the PROOF cluster - is not completely achieved. Examples are the TTree::fUserInfo list, active locally but ignored in PROOF, or the TSelector::fOutput list, which can be ignored in local processing. The reasons for this are probably historical, but the net effect is that TSelector-based code running locally may not run on PROOF, and debugging is very difficult.
- Taking advantage, locally, of the PROOF beauties requires starting an external daemon, connecting via TCP/IP, and so on. All this can and should be done more efficiently on a local machine, possibly without using the network.
Here we present some ideas for a tool that allows the code to be run on PROOF to be tested locally; as a by-product, we believe that this may make the exploitation of local resources (on multi-CPU or multi-core machines) automatic.
PROOF basics
The way PROOF distributes the work among the workers is schematically shown here:
The client sees the PROOF session through a TProof object, proof. When a TChain::Process command is issued, TProof::Process is called: this makes sure that an instance of the PROOF player (TProofPlayerRemote, player) exists, and passes the instructions to it.
On the client, player performs the following tasks:
- runs the Begin() method of the required TSelector
- packs everything (selector code, data set info, input list, ...) and sends the process request to proofserv , the instance of TProofServ running on the master
- at the end of the query collects the replies from the master and runs the Terminate() method of the required selector.
On the master, proofserv creates an instance of TProofPlayerRemote, player, which does the following:
- creates an instance of the packetizer and validates it
- sends out the processing request to the workers
- waits for the completion of the query, distributing the packets via the packetizer upon requests from the workers
- merges the partial output lists received from the workers at the end of processing and sends them back to the client.
On the workers, proofserv creates an instance of TProofPlayerSlave, player, which:
- runs the SlaveBegin() method of the required TSelector
- asks the master for packets
- loops over the events in the packet, calling the Process() method of the required TSelector
- runs the SlaveTerminate() method of the required TSelector
- sends back the output list to the master at the end of the query
Ideas for PROOF local
For local processing the network layer, represented by the arrows between blocks in the previous picture, should not be necessary. The idea is to run the workers as threads. Of course, there are currently parts of ROOT which are known not to be thread-safe (e.g. CINT). We believe, however, that basic TTree processing should be possible if some care is taken inside the selectors.
With two threads, the scheme would look like this:
The idea is to introduce a new implementation of TProofPlayer, TProofPlayerLocal. This class would be instantiated by TProof::Process (local player); the tasks of the local player are:
- run the Begin() method of the required TSelector
- prepare the creation of the worker threads; the number of worker threads should not exceed the number of CPUs in the machine (except for testing; but be careful with the interpretation of the results!). To avoid interference, each thread should have its own output list, its own status flag and its own semaphore. A global semaphore is needed in addition to control the packet distributor.
- start the threads with, as argument, a pointer to a structure containing the following information:
- fInput, pointer to the input list (or to a copy of it, to avoid problems if one thread modifies it by mistake);
- fThrOutput, the output list owned and to be filled by the thread;
- fThrSem, the semaphore coordinating the operations of the thread;
- fSem, the global semaphore coordinating the packetizer;
- fElem, pointer to the TDSetElement to be analysed: should be filled with the packet when appropriate;
- fDone, a flag set to true by the packetizer when the thread is no longer needed.
- coordinate the work distribution using the global semaphore fSem and the thread-specific semaphores fThrSem; packets should be equal in size; the granularity may be high, as the overhead of distribution should be small (to be checked).
- merge the fThrOutput into fOutput when all the threads are done
- run the Terminate() method of the required selector.
The crucial part is the work distribution, which must be carefully synchronized via the semaphores.
Here is a schematic view of the coordinated work between the main thread and a worker thread:
--
GerardoGanis - 02 Oct 2006